Data knowledge systems form a broad and deep field that covers the entire process from data collection and processing through analysis to application. The main components are summarized below:
I. Knowledge systems for data management
The knowledge system for data management is organized mainly by the Data Management Association International (DAMA), whose body of knowledge describes every area of data management in depth. The DAMA framework (e.g., DAMA-DMBOK2) consists mainly of the following knowledge areas:
Data governance: meeting enterprise data needs by establishing a decision-making system that provides guidance and oversight for data management.
Data architecture: a blueprint for defining and managing data assets, establishing strategic data requirements and the overall design that meets them.
Data modeling and design: analyzing, representing, and communicating data requirements in the precise form of a data model.
Data storage and operations: operational activities across the data life cycle aimed at maximizing the value of data.
Data security: ensuring data privacy and confidentiality, maintaining data integrity, and providing appropriate access to data.
Data integration and interoperability: processes related to moving and consolidating data within and between data stores, applications, and organizations.
Document and content management: managing the life cycle of unstructured data and information in any media, and supporting legal and regulatory compliance requirements.
Reference and master data: coordinating and maintaining core shared data so that information about key business entities stays consistent.
Data warehousing and business intelligence: managing decision-support data and enabling knowledge workers to gain value from data through analysis.
Metadata: planning, implementing, and controlling activities that enable access to high-quality, integrated metadata.
Data quality: planning and implementing quality management techniques to measure, assess, and improve the fitness of data for use.
In addition, the framework includes environmental elements such as goals and principles, activities, key deliverables, roles and responsibilities, practices and methods, technology, and organization and culture.
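The data quality area above speaks of measuring and assessing data. As a minimal sketch of what such a measurement might look like, the following computes two commonly used indicators, completeness and uniqueness, for a single field. The function names, the sample records, and the choice of these two indicators are illustrative assumptions and are not taken from DAMA-DMBOK2.

```python
from typing import Any

# Values treated as "missing" in this sketch (an assumption; real rules vary).
MISSING = (None, "")


def completeness(records: list[dict[str, Any]], field: str) -> float:
    """Share of records in which `field` is present and not missing."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(field) not in MISSING)
    return filled / len(records)


def uniqueness(records: list[dict[str, Any]], field: str) -> float:
    """Share of non-missing values of `field` that are distinct."""
    values = [r[field] for r in records if r.get(field) not in MISSING]
    if not values:
        return 0.0
    return len(set(values)) / len(values)


# Hypothetical customer records, invented for illustration only.
customers = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None},
    {"id": 3, "email": "a@example.com"},
    {"id": 4, "email": "b@example.com"},
]

email_completeness = completeness(customers, "email")  # 3 of 4 records filled
email_uniqueness = uniqueness(customers, "email")      # 2 distinct of 3 values

print(f"completeness: {email_completeness:.2f}")
print(f"uniqueness:   {email_uniqueness:.2f}")
```

In practice such indicators would be tracked over time and compared against thresholds agreed through data governance; the thresholds themselves are a business decision rather than something the code can supply.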
II. Knowledge system for data science
The knowledge system for data science rests mainly on statistics, machine learning, data visualization, and domain knowledge. Its main topics include the foundational theory of data science, data pre-processing, data computing, and data management. The technologies and tools used in data science are fairly specialized; the R language, for example, is one of the tools data scientists commonly use.
III. Knowledge system for data analysis
The knowledge system for data analysis covers data-driven thinking, methodology, basic analysis tools, business knowledge, and practical experience. It emphasizes the full analysis process: data collection, data cleansing, data analysis, data visualization, and report writing. It also includes commonly used tools such as Excel and Python, and analysis methods such as multi-dimensional analysis and funnel analysis. In addition, it focuses on developing data thinking and habits of thought such as structured thinking, schematic expression, and operationalization.
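Funnel analysis, mentioned above, tracks how many users remain at each step of a process and where they drop off. The following is a minimal sketch in Python; the function name `funnel_report`, the stage names, and the user counts are hypothetical and chosen only for illustration.

```python
def funnel_report(stage_counts: list[tuple[str, int]]) -> list[dict]:
    """Compute step-to-step and overall conversion rates for ordered stages."""
    if not stage_counts:
        return []
    first = stage_counts[0][1]   # users entering the funnel
    prev = first                 # users at the previous stage
    rows = []
    for name, count in stage_counts:
        rows.append({
            "stage": name,
            "users": count,
            # Conversion relative to the previous stage.
            "step_rate": count / prev if prev else 0.0,
            # Conversion relative to the top of the funnel.
            "overall_rate": count / first if first else 0.0,
        })
        prev = count
    return rows


# Hypothetical counts for an online purchase flow.
stages = [
    ("visit", 1000),
    ("add_to_cart", 300),
    ("checkout", 150),
    ("purchase", 120),
]

report = funnel_report(stages)
for row in report:
    print(f"{row['stage']:<12} {row['users']:>5} "
          f"step {row['step_rate']:.0%}  overall {row['overall_rate']:.0%}")
```

The stage with the lowest step rate is usually the first place to look for problems; with these invented numbers that is the move from visiting to adding an item to the cart.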
IV. Other relevant knowledge
In constructing a data knowledge system, the following also need to be considered:
Statistical principles: a solid grounding in statistics is essential for understanding data distributions and for identifying patterns and anomalies in data. This includes topics such as probability distributions, hypothesis testing, and confidence intervals.
Machine learning algorithms: machine learning algorithms learn and discover patterns from large amounts of data in order to make predictions or classifications. The main types are supervised learning, unsupervised learning, and reinforcement learning.
Practical application: applying the acquired knowledge to real projects is the best way to test and improve data analysis skills. Examples include e-commerce data analysis and financial risk assessment.
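To make the statistical topics above concrete, here is a minimal sketch of a confidence interval and a two-sided hypothesis test for a sample mean, using only the Python standard library. It assumes a normal approximation (a z-based interval and test), which is only rough for small samples, where a t distribution would normally be preferred. The function names and the sample values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample: list[float],
                             confidence: float = 0.95) -> tuple[float, float]:
    """Normal-approximation confidence interval for the population mean."""
    m = mean(sample)
    se = stdev(sample) / sqrt(len(sample))           # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # about 1.96 for 95%
    return m - z * se, m + z * se


def z_test_p_value(sample: list[float], mu0: float) -> float:
    """Two-sided p-value for H0: the population mean equals mu0."""
    se = stdev(sample) / sqrt(len(sample))
    z = (mean(sample) - mu0) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical measurements, invented for illustration only.
sample = [102.0, 98.5, 101.2, 99.8, 100.4, 97.9, 103.1, 100.0, 99.3, 101.7]

low, high = mean_confidence_interval(sample)
p_value = z_test_p_value(sample, mu0=90.0)

print(f"mean = {mean(sample):.2f}, 95% CI = ({low:.2f}, {high:.2f})")
print(f"p-value for H0: mean = 90 is {p_value:.4f}")
```

A small p-value suggests the data are hard to reconcile with the hypothesized mean, while the interval gives a range of plausible values for it; both depend on the approximation stated above.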





