

The Best Practices of Enterprise-level Data Center Construction

2017-07-28 16:27:09 | Source: 中培企業IT培訓網

At present, most data centers used by engineering enterprises are built with traditional technology, which has several disadvantages: high construction cost, weak scalability, and limited capacity for calculation and analysis. To meet the big data needs of storage, processing, analysis and application, enterprise-level data centers combine technologies such as parallel computing, large-scale data analysis, linear expansion and support for all data types. This lets them centralize, integrate and analyze data resources across all businesses, levels and types.

Data centers built by most enterprises in the engineering industry have accumulated large amounts of structured data, unstructured data, geographic information data and massive real-time data. Most of them use centralized server architectures (such as Oracle RAC), which scale poorly and cannot keep up with growing storage needs. Data processing is mainly based on single-point models and lacks real-time parallel computing, so it cannot process massive data in real time. Storage and processing can only cope with structured data: they cannot effectively store, process or analyze unstructured data, cannot provide storage and processing services for all data types in a big data environment, and cannot support deep analysis of data.

The overall structure of a big data-based enterprise-level data center in the engineering industry is shown in Figure 1. It is divided into seven layers: the data source layer, data integration layer, data storage layer, analysis/service layer, business application layer, front-end access layer and the overall data management platform.

 

Figure 1. The Overall Structure of Enterprise-level Data Centers in Engineering Industry Based on Big Data

Using interface tables, interface files, data reception services and data information reception, the data center acquires structured, unstructured and real-time data to meet different timeliness requirements. In the data storage layer, it contains data storage platforms, distributed data platforms and streaming data platforms, which store data with different characteristics and provide the related data services. The data center delivers integrated result data through batch push and real-time data services, and meets data sharing and data application requirements through asynchronous data push. It also provides comprehensive information display and analysis and decision-making functions, and presents all kinds of analysis results on various front ends (such as PC, large-screen and mobile terminals). Meanwhile, it provides data resource management, which covers metadata, data quality, data standards, data models and data resources.

Data Integration Layer:

[including data acquisition and job scheduling]

Data acquisition

Data acquisition collects structured, unstructured and real-time data from the source systems and delivers it to the data center. It covers interface table processing, message reception processing, data reception processing, real-time data acquisition processing and unstructured file processing.
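One way to picture these separate processing paths is a small dispatcher that routes each incoming payload to a handler for its channel. The sketch below is only an illustration: the channel names, the handler functions and the `data_type` tag are assumptions made for this example, and only three of the five paths are shown.

```python
from typing import Callable, Dict

Handler = Callable[[dict], dict]

# Registry of acquisition handlers, keyed by a hypothetical channel name.
HANDLERS: Dict[str, Handler] = {}


def channel(name: str) -> Callable[[Handler], Handler]:
    """Register a handler function for one acquisition channel."""
    def register(func: Handler) -> Handler:
        HANDLERS[name] = func
        return func
    return register


@channel("interface_table")
def from_interface_table(payload: dict) -> dict:
    # Rows read from an interface table are treated as structured data.
    return {"data_type": "structured", "payload": payload}


@channel("real_time")
def from_real_time(payload: dict) -> dict:
    # Readings pushed by a source system are treated as real-time data.
    return {"data_type": "real-time", "payload": payload}


@channel("unstructured_file")
def from_unstructured_file(payload: dict) -> dict:
    # Files such as documents or drawings are treated as unstructured data.
    return {"data_type": "unstructured", "payload": payload}


def acquire(channel_name: str, payload: dict) -> dict:
    """Route one payload to the handler registered for its channel."""
    handler = HANDLERS.get(channel_name)
    if handler is None:
        raise ValueError(f"no handler registered for channel {channel_name!r}")
    return handler(payload)
```

Keeping the handlers in a registry means a new source type can be added without touching the routing code.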

Job scheduling

Job scheduling provides unified, centralized scheduling of three kinds of jobs: acquisition of structured, unstructured and real-time data; internal data processing in the data center (including ETL, MapReduce, Sqoop, etc.); and jobs that push data to each target system. It implements a scheduling engine, supports automatic and manual adjustment of jobs, and controls job execution order based on the configured job dependencies. It also controls job concurrency and records the results and logs of each run.
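Dependency-ordered execution with a concurrency limit and a run log can be sketched with the Python standard library. This is a minimal illustration, not the scheduling engine the article describes: the function name `run_jobs`, the log format and the choice to stop scheduling dependents of a failed job are all assumptions. It requires Python 3.9 or later for `graphlib`.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter  # Python 3.9+
from typing import Callable, Dict, List, Set


def run_jobs(jobs: Dict[str, Callable[[], object]],
             depends_on: Dict[str, Set[str]],
             max_concurrency: int = 2) -> List[dict]:
    """Run jobs in dependency order, at most max_concurrency at a time."""
    unknown = {d for deps in depends_on.values() for d in deps} - set(jobs)
    if unknown:
        raise ValueError(f"dependencies on unknown jobs: {sorted(unknown)}")

    sorter = TopologicalSorter({name: depends_on.get(name, set()) for name in jobs})
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies

    log: List[dict] = []
    running = {}  # future -> job name
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        while sorter.is_active():
            # Submit every job whose dependencies have all finished.
            for name in sorter.get_ready():
                running[pool.submit(jobs[name])] = name
            if not running:
                break  # remaining jobs are blocked by a failed dependency
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                name = running.pop(future)
                error = future.exception()
                if error is None:
                    log.append({"job": name, "status": "ok", "result": future.result()})
                    sorter.done(name)  # unblock jobs that depend on this one
                else:
                    log.append({"job": name, "status": "failed", "result": repr(error)})
    return log
```

For example, calling `run_jobs` with jobs named `extract`, `transform` and `load`, where `transform` depends on `extract` and `load` depends on `transform`, runs them strictly in that order and returns one log entry per job.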

Data Storage Layer:

[It contains traditional data repository platforms based on relational databases, as well as distributed data platforms and streaming data platforms based on the Hadoop ecosystem, which store different kinds of data and provide different data services.]

The data repository platform

The data repository platform uses a hierarchical design and is divided into a buffer layer, integration layer, summary layer and market layer.

The buffer layer stores the data that the data center collects from source systems. It relieves the source systems of the load of distributing data in bulk and in real time, and avoids the performance pressure, time lag between versions, repeated development and redundant storage caused by fetching the same data repeatedly. As a data source in its own right, it also shields the integration layer and summary layer from changes in the original systems (such as data structure or time window).

The integration layer holds business data after cleaning, conversion and integration. It is the core data layer of the data center.

The summary layer holds statistical and aggregate enterprise data organized by subject dimension. Aggregate data can be formed according to the needs of subject reports. It is stored by main body and calculated from business data along the dimensions of data, main body and processing type.

The data market layer (data mart) is the analytical data set for a specific business unit (such as a business department). Its data is mainly derived from the integration layer and summary layer, and it also contains analytical data that supports specific targets.
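The flow through the four layers can be illustrated with a few plain functions. The rows, field names and cleaning rule below are hypothetical and exist only to show data moving from buffer to integration to summary to market layer. A real repository would implement these steps in a relational database.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical raw rows as they might land in the buffer layer.
BUFFER_ROWS = [
    {"project": " P-001 ", "department": "civil", "cost": "1200.50"},
    {"project": "P-002", "department": "civil", "cost": "800"},
    {"project": "P-003", "department": "electrical", "cost": "n/a"},
    {"project": "P-004", "department": "electrical", "cost": "450"},
]


def integrate(rows: List[dict]) -> List[dict]:
    """Integration layer: clean and convert buffered rows, skipping bad ones."""
    cleaned = []
    for row in rows:
        try:
            cost = float(row["cost"])
        except ValueError:
            continue  # a real system would hand this row to data quality checks
        cleaned.append({
            "project": row["project"].strip(),
            "department": row["department"],
            "cost": cost,
        })
    return cleaned


def summarize(rows: List[dict], dimension: str) -> Dict[str, float]:
    """Summary layer: total the cost along one subject dimension."""
    totals: Dict[str, float] = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row["cost"]
    return dict(totals)


def build_market(summary: Dict[str, float], department: str) -> Dict[str, float]:
    """Market layer: the slice of summary data one business unit needs."""
    return {department: summary[department]}


integrated = integrate(BUFFER_ROWS)
by_department = summarize(integrated, "department")
civil_market = build_market(by_department, "civil")
```

Because the buffer rows are kept unchanged, the integration step can be rerun with new cleaning rules without fetching from the source systems again.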

The distributed data platform

The distributed data platform mainly stores data that is difficult to keep in a traditional relational database: massive structured data, unstructured file data, and dumps of streaming data and relational database data. Based on the storage requirements and the characteristics of the distributed platform's components, the platform is divided into an HBase-based data storage area and a Hive-based data storage area.

The unstructured data layer stores unstructured data from all source systems, including office documents, design drawings, text files, image files, etc.

The massive structured data layer stores massive structured data from all structured systems.

The streaming data dump layer stores periodic dumps from the streaming data platform, helping it achieve persistent storage of real-time data.

The streaming data platform

The streaming data platform includes a real-time data integration layer, a real-time data summary layer and a business data buffer layer.

In the real-time data integration layer, source systems connect through a uniform Socket interface to avoid inconsistency among data sources. The data center listens on the sockets of the source systems. When a source system has data, the listener obtains it and writes the source information of the received data to the corresponding message queue.
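The listen-and-enqueue step can be sketched with Python's standard `socket` and `queue` modules. The newline-delimited message framing, the `listen` function and the source name are assumptions for this illustration, and an in-process `queue.Queue` stands in for whatever message queue the real platform uses.

```python
import queue
import socket
from typing import List


def listen(conn: socket.socket, source_name: str, out: queue.Queue) -> None:
    """Read newline-delimited messages from one source into a queue."""
    pending = b""
    while True:
        chunk = conn.recv(4096)
        if not chunk:  # the source system closed the connection
            break
        pending += chunk
        # Emit every complete line; keep any partial line for the next read.
        while b"\n" in pending:
            line, pending = pending.split(b"\n", 1)
            out.put({"source": source_name, "data": line.decode("utf-8")})


def demo() -> List[dict]:
    """Simulate one source system with a connected socket pair."""
    source_end, center_end = socket.socketpair()
    messages: queue.Queue = queue.Queue()
    source_end.sendall(b"temp=21.5\npressure=101.3\n")
    source_end.close()
    listen(center_end, "sensor-gateway", messages)
    center_end.close()
    return [messages.get_nowait() for _ in range(messages.qsize())]
```

Tagging each message with its source at the point of entry is what lets downstream processing treat all source systems uniformly.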

The real-time data summary layer uses Storm to process the source data from the integration layer's message queues as a stream. It aggregates, calculates and stores data according to business requirements.
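The kind of aggregation this layer performs can be illustrated in plain Python with fixed time windows. This is not Storm code and does not use Storm's API. The event shape `(timestamp, key, value)`, the window length and the function name are assumptions chosen for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Event = Tuple[float, str, float]  # (timestamp in seconds, key, value)


def aggregate_by_window(events: Iterable[Event],
                        window_seconds: float) -> Dict[Tuple[int, str], dict]:
    """Group events into fixed time windows and keep a count and sum per key."""
    totals: Dict[Tuple[int, str], dict] = defaultdict(lambda: {"count": 0, "sum": 0.0})
    for timestamp, key, value in events:
        window = int(timestamp // window_seconds)  # index of the window
        bucket = totals[(window, key)]
        bucket["count"] += 1
        bucket["sum"] += value
    return dict(totals)


readings = [(0.5, "pump", 2.0), (3.0, "pump", 4.0), (12.0, "pump", 1.0)]
windows = aggregate_by_window(readings, window_seconds=10)
```

Since `events` can be any iterable, the same function also accepts a generator that yields messages as they are taken off a queue.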

The business data buffer layer holds the result data that Storm produces according to the specific business logic once its calculation on the streaming data is finished. (The corresponding architecture is shown in Figure 2.)

 

Analysis/Service Layer:

It includes the comprehensive information display platform, the intelligent analysis and decision-making platform and data services (shown in Figure 3).

Comprehensive information display platform

The comprehensive information display platform is an application built on the data storage layer that includes report query and comprehensive analysis. It supports dynamic configuration of analysis pages, including page content, layout, components, CCTV, linkage relations, etc.

Intelligent analysis and decision-making platform

The intelligent analysis and decision-making platform includes several modules, such as data loading, data preprocessing, data mining algorithms, analysis model management and model operation scheduling. It provides technical support for data understanding, data preprocessing, algorithm modeling, model evaluation, model application, etc. To meet the requirements of big data analysis, it also provides a mining algorithm library for big data with three types of algorithms: descriptive mining algorithms, such as clustering analysis and correlation analysis; predictive mining algorithms, such as classification analysis, evolution analysis and heterogeneous analysis; and mining algorithms for dedicated data analysis, such as text analysis, speech analysis, image analysis and video analysis.

Data services

Data services mainly provide real-time data services, subscription, release, batch data services, etc. They also provide a cache function to improve the overall performance of the system.
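Subscription, release and caching can be sketched together in one small in-memory class. The class name `DataService`, its methods and the `load` callback are hypothetical and stand in for the real service interfaces, which the article does not specify.

```python
from typing import Callable, Dict, List


class DataService:
    """Toy in-memory service offering subscribe/publish and a read cache."""

    def __init__(self, load: Callable[[str], object]) -> None:
        self._load = load  # slow lookup, for example a database query
        self._cache: Dict[str, object] = {}
        self._subscribers: Dict[str, List[Callable[[object], None]]] = {}
        self.cache_misses = 0

    def subscribe(self, topic: str, callback: Callable[[object], None]) -> None:
        """Register a callback that receives every record released on a topic."""
        self._subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic: str, record: object) -> int:
        """Push a record to every subscriber of the topic; return how many."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(record)
        return len(callbacks)

    def query(self, key: str) -> object:
        """Return cached data when available, loading it only on a miss."""
        if key not in self._cache:
            self.cache_misses += 1
            self._cache[key] = self._load(key)
        return self._cache[key]
```

This cache never expires its entries. A production service would also need an invalidation or expiry policy so that cached results do not go stale when the underlying data changes.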

 

Data Management Layer:

It includes metadata management, data quality management, main data management, data standard management, and centralized job scheduling and monitoring (shown in Figure 4).

Metadata management

It enables rapid search, acquisition, use and sharing of metadata in the data center. It also provides metadata support for data sharing and exchange, multidimensional analysis, decision support, data mining, etc.
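Metadata search can be pictured as a keyword lookup over a catalog of table descriptions. The `TableMetadata` fields and the catalog entries below are invented for this example. A real metadata store would hold far richer information, such as lineage and ownership.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TableMetadata:
    name: str
    layer: str
    columns: Tuple[str, ...]


# Hypothetical catalog entries, one per table registered in the data center.
CATALOG = [
    TableMetadata("buf_contract", "buffer", ("contract_id", "raw_payload")),
    TableMetadata("int_contract", "integration", ("contract_id", "project_id", "amount")),
    TableMetadata("sum_project_cost", "summary", ("project_id", "total_cost")),
]


def search(catalog: List[TableMetadata], keyword: str) -> List[str]:
    """Return names of tables whose name or columns contain the keyword."""
    needle = keyword.lower()
    return [
        table.name
        for table in catalog
        if needle in table.name.lower()
        or any(needle in column.lower() for column in table.columns)
    ]
```

Here `search(CATALOG, "project")` returns `["int_contract", "sum_project_cost"]`, because the first has a `project_id` column and the second matches on both its name and a column.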

Data quality management

It performs standardized quality audits of data in the data center and ensures the timeliness, completeness and compliance of data received from business systems.
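A quality audit of this kind can be sketched as a function that checks one record for completeness, compliance and timeliness. The required fields, the five-minute delay limit and the rule that amounts must be non-negative are assumptions made up for the example, not rules from the article.

```python
from datetime import datetime, timedelta
from typing import List

# Hypothetical audit rules for one kind of business record.
REQUIRED_FIELDS = ("project_id", "amount", "received_at")
MAX_DELAY = timedelta(minutes=5)


def audit(record: dict, now: datetime) -> List[str]:
    """Return a list of quality issues found in one record (empty if clean)."""
    issues = []

    # Completeness: every required field must be present and not None.
    missing = [name for name in REQUIRED_FIELDS if record.get(name) is None]
    if missing:
        issues.append("incomplete: missing " + ", ".join(missing))

    # Compliance: the amount, when present, must be a non-negative number.
    amount = record.get("amount")
    if amount is not None and (not isinstance(amount, (int, float)) or amount < 0):
        issues.append("non-compliant: amount must be a non-negative number")

    # Timeliness: the record must have arrived within the allowed delay.
    received_at = record.get("received_at")
    if received_at is not None and now - received_at > MAX_DELAY:
        issues.append("late: received more than 5 minutes ago")

    return issues
```

Passing `now` as an argument instead of reading the clock inside the function keeps the audit repeatable, which matters when the same batch is re-audited later.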

Main data management

It provides unified management, application and maintenance of main data (master data) such as materials, projects and contracts, ensuring that modifications to main data remain consistent and stable.

Data standard management

It provides unified management of the standard documents in the data center.

Centralized job scheduling and monitoring

It provides unified scheduling, management and monitoring of ETL interface jobs and big data jobs.

 

As informatization advances in the engineering industry, information systems have been fully integrated into all aspects of enterprise production and management, and have accumulated large amounts of structured data, unstructured data, geographic information data and massive real-time data. Big data-based enterprise-level data centers can make up for the disadvantages of traditional technology: they address weak scalability, high construction cost and limited capacity for calculation, analysis and mining, and meet the requirements of storing, processing, analyzing and applying all types of data in a big data environment.

