Final Portfolio Project
Ankitha Pagadala
ITS 831
11/25/2022
Data Warehouse
Data warehouse architecture is the design for organizing an organization's data collections so that they can be stored within a specific framework. A data warehouse has several distinct components: the data warehouse database, metadata, query tools, the data warehouse bus architecture, and the sourcing, acquisition, cleanup, and transformation (ETL) tools (Khan, 2019).
Metadata
Metadata is the data that describes the data warehouse and is used for building, maintaining, and managing it. It specifies the source, usage, features, and values of the data in the warehouse. It also defines how data is changed and processed (Khan, 2019).
ETL Tools Component
The data sourcing, migration, and transformation tools carry out functions such as summarization, conversion, and any other changes needed to transform data into the unified format used in the data warehouse.
Query Tools Component
Query tools allow users to interact with the data held in the data warehouse system. Typical data warehouse queries are generated by online analytical processing (OLAP) or by data mining software. These queries tend to be complex and must handle the large number of rows in the underlying database (Inmon et al., 2019).
Data Warehouse Database
The data warehouse database pulls together different types of data from different sources within an organization for analysis and reporting. Reports drawn from the warehouse are critical to business decision-making because they are based on analysis of historical data and extract key insights from it.
Data Warehouse Bus Architecture
This component is a set of tightly integrated data marts that draw their “power” from a standard set of conformed facts and dimensions. A dimension table consists of textual data that decodes the identifiers linked to the fact tables (Inmon et al., 2019).
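As a rough illustration of how a dimension table decodes the identifiers stored in a fact table, the following minimal sketch uses Python's built-in sqlite3 module. The table names (dim_product, fact_sales), the columns, and the rows are hypothetical and chosen only for illustration; they do not come from the cited sources.

```python
import sqlite3


def build_star_schema():
    """Create a tiny in-memory fact table and dimension table (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dim_product (
            product_id   INTEGER PRIMARY KEY,
            product_name TEXT NOT NULL
        );
        CREATE TABLE fact_sales (
            product_id INTEGER REFERENCES dim_product(product_id),
            units_sold INTEGER NOT NULL
        );
        INSERT INTO dim_product VALUES (1, 'Laptop'), (2, 'Monitor');
        INSERT INTO fact_sales VALUES (1, 3), (2, 5), (1, 4);
    """)
    return conn


def sales_by_product(conn):
    """Join facts to the dimension so numeric identifiers become text labels."""
    return conn.execute("""
        SELECT d.product_name, SUM(f.units_sold)
        FROM fact_sales AS f
        JOIN dim_product AS d ON d.product_id = f.product_id
        GROUP BY d.product_name
        ORDER BY d.product_name
    """).fetchall()


if __name__ == "__main__":
    print(sales_by_product(build_star_schema()))
```

The fact table holds only identifiers and measures, while the descriptive text lives once in the dimension table. The join is what turns the identifier 1 into the label "Laptop" in the report.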
Data transformation is the changing of data formats and values. Several processes are needed to prepare data for the warehouse, including the following.
Extraction pulls the data from a source system and moves it to the staging area.
Loading is the next process. It involves loading large data sets and should be optimized.
Transformation involves cleansing, mapping, and converting the raw data, which is extracted in its original, unusable form (Chi, 2017).
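The three processes above can be sketched as a small pipeline. This is a minimal illustration using only the Python standard library, not a production ETL tool. The source data, the sales table, and the cleansing rules are all assumptions made for the example. The sketch applies transformation before loading, which is one common ordering.

```python
import csv
import io
import sqlite3

# Hypothetical source data; a real source system would be a file, API, or database.
SOURCE_CSV = """customer,amount
 alice ,10.50
BOB,20
,5.00
carol,not_a_number
"""


def extract(source_text):
    """Extraction: read raw rows from the source into a staging list."""
    return list(csv.DictReader(io.StringIO(source_text)))


def transform(staged_rows):
    """Transformation: cleanse and map raw rows into a unified format."""
    clean = []
    for row in staged_rows:
        name = (row["customer"] or "").strip().title()
        try:
            amount = float(row["amount"])
        except (TypeError, ValueError):
            continue  # drop rows whose amount cannot be parsed
        if not name:
            continue  # drop rows with no customer name
        clean.append((name, amount))
    return clean


def load(conn, rows):
    """Loading: bulk-insert the prepared rows in a single transaction."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL)")
    with conn:
        conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)


def run_pipeline():
    """Run extract, transform, and load against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    load(conn, transform(extract(SOURCE_CSV)))
    return conn.execute(
        "SELECT customer, amount FROM sales ORDER BY customer"
    ).fetchall()


if __name__ == "__main__":
    print(run_pipeline())
```

Inserting all rows with one executemany call inside a single transaction is a simple way to keep the loading step efficient for larger data sets.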
Big Data
Big data refers to very large data sets that can be analyzed computationally to reveal patterns, associations, and trends in the data, especially those relating to human behavior and interaction. Big data requires large storage capacity and includes both structured and unstructured data formats. The types of big data analytics include descriptive, diagnostic, predictive, and prescriptive analytics. Big data has several characteristic traits, namely volume, velocity, variety, veracity, and value, which need to be considered during data analysis. Big data can also be grouped into categories such as internal and external data, structured and unstructured data, genomic data, open data, and many more (Shi, 2022).
An example of big data use that I have seen personally is Facebook, a large social media company that holds large chunks of data and has therefore invested heavily in data infrastructure such as data warehouses. The company does not bear the cost of processing this data alone, because it earns revenue by making the data available to other companies that need it. In most cases, the company receives advertisements from other companies that have shared their contacts with it, and those companies can then reach their target group of clients. This information is placed on the social media platform under agreed terms and conditions, which allows advertisers to sell to and reach the parties they are interested in (Schmarzo, 2015).
Big data places several demands on organizations and data management technology, including the following.
Big data management involves bringing policies, people, and technologies together to ensure security, quality, and accuracy, which pushes organizations to implement new technologies and to think more broadly (Marr, 2015).
Another demand is deriving insights from varied information that are better, more timely, fact-based, and more intelligent, which influences how quickly society adopts technology and learns to handle big data and related platforms (Marr, 2015).
A further demand is implementing new practices that involve team members from different departments of the organization. This helps create stronger access and identity management controls, including audit trails and protection of sensitive data sets (Marr, 2015).
Green Computing
Green computing is an approach aimed at protecting the environment from the hazardous effects of computer use. It advocates a healthy environment by addressing the issues related to the manufacture, use, and disposal of computers and their parts. Today, most organizations have developed environmentally friendly ways of disposing of their used computers without harming the environment in which they are disposed of (Vikram, 2015).
Several strategies can help keep an organization’s data center green, including the following.
Reducing power usage
This strategy involves measures that help an organization reduce the variable costs of data center power usage. The cost of power continues to increase at an alarming rate, which means that within about five years there will be a drastic increase in power costs, so measures are needed to reduce them. It is critical for any data center to put measures in place to reduce the energy consumed by its equipment. This means that managers in these departments must prioritize estimating future service demand, consolidating workloads, and replacing old-model servers with new-model servers (Kasemsap, 2018).
Conducting regular energy audits
This strategy involves a baseline audit that gives a real-time assessment of energy consumption and identifies measures for starting a green computing zone. The audit should not only provide a benchmark for future assessments but also supply essential information for long-term planning. It involves individual assessments of all the systems under consideration, which helps pinpoint energy inefficiencies in the infrastructure. At that point, the audit will have provided detailed information on energy usage, the inefficiencies recorded, and the drawbacks and their remedies with respect to the environment (Sabban, 2021).
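One way to picture how a baseline audit serves as a benchmark is the minimal sketch below. It compares each system's later energy reading with its baseline and flags the systems whose consumption has grown beyond a tolerance. The system names, the kWh figures, and the 10 percent tolerance are hypothetical assumptions for illustration and are not drawn from the cited sources.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    system: str
    baseline_kwh: float  # consumption recorded in the baseline audit
    current_kwh: float   # consumption recorded in a follow-up audit


def flag_inefficiencies(readings, tolerance=0.10):
    """Return (system, percent increase) for systems above baseline by more than tolerance."""
    flagged = []
    for r in readings:
        change = (r.current_kwh - r.baseline_kwh) / r.baseline_kwh
        if change > tolerance:
            flagged.append((r.system, round(change * 100, 1)))
    return flagged


# Hypothetical audit data for three systems.
readings = [
    Reading("web-server-01", 100.0, 104.0),
    Reading("db-server-01", 200.0, 250.0),
    Reading("storage-array", 150.0, 180.0),
]

if __name__ == "__main__":
    for system, pct in flag_inefficiencies(readings):
        print(f"{system}: {pct}% above baseline")
```

Assessing each system individually, rather than only the data center total, is what lets the audit point to the specific equipment that needs a remedy.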
Conclusion
There are several trends in data warehousing. First, column-based storage stores data in column format, which helps businesses conduct advanced analytics efficiently. Second, managed services are a higher-level type of service in which any problems that arise are handled by the cloud provider. Lastly, data marts for production lines provide a solution by offering summarized data sets from a particular business unit (Chi, 2017).
In addition, the five traits of big data, such as variety and value, are relevant to business organizations. Because big data is expensive to store, most firms consider whether the data has enough value to help generate revenue and sustain company operations and activities (Marr, 2015).
Using materials that are friendly to a green data center
In this case, organizations are encouraged to make changes to their data centers, including but not limited to shutting down dormant servers, transitioning to LED lighting, upgrading air compressors, and shredding paper. These changes would be strengthened by powering servers from locally available alternative sources and by building the data center in a naturally cold environment. Materials used in energy production should have a minimal carbon footprint, as this is better for the environment (Sabban, 2021).
References
Chi, T. (2017). Build information system pyramid: Ecology of data warehouse second edition.
Inmon, W., Linstedt, D., & Levins, M. (2019). The data warehouse/Operational environment interface. Data Architecture, 219-224. https://doi.org/10.1016/b978-0-12-816916-2.00028-0
Kasemsap, K. (2018). Cloud computing, green computing, and green ICT. Advances in Business Information Systems and Analytics, 28-50. https://doi.org/10.4018/978-1-5225-3038-1.ch002
Khan, A. (2019). Data warehousing 101: Concepts and implementation. iUniverse.
Marr, B. (2015). Big data: Using SMART big data, analytics and metrics to make better decisions and improve performance. John Wiley & Sons.
Sabban, A. (2021). Green computing technologies and computing industry in 2021. BoD – Books on Demand.
Schmarzo, B. (2015). Big data MBA: Driving business strategies with data science. John Wiley & Sons.
Shi, Y. (2022). Big data and big data analytics. Advances in Big Data Analytics, 3-21. https://doi.org/10.1007/978-981-16-3607-3_1
Vikram, S. (2015). Green computing. 2015 International Conference on Green Computing and Internet of Things (ICGCIoT). https://doi.org/10.1109/icgciot.2015.7380566