
Module 6: Introduction to Big Data and Data Analytics

6.1: KTP 1 Introduction to Big Data
6.2: KTP 2 Introduction to Data Analytics
6.3: Summary
6.4: References

6.2 KTP 2: Introduction to Data Analytics

6.2.1 Overview
With the rapid development of the Internet, the world is entering the era of big data. Big data
not only brings profound changes to people's work and life, but also provides a powerful tool for
modernizing national governance capacity, serving as a driving force and key reference for social
management and governance.

The application of big data offers similar opportunities for SAIs. Auditing assisted by big data
analytics, an all-data approach targeting the audited entities, is changing traditional audit
mentality and methodology and has attracted extensive attention from SAIs. SAIs have started
wider and larger-scale data collection and application, and have continuously explored new
techniques and methods. Within the INTOSAI framework, SAIs are making sustained efforts to
share knowledge and experience about big data, to promote national and global good
governance and to support global sustainable development. INTOSAI has also paid keen
attention to big data analytics.

In this session, we will discuss how data analytics helps auditors. A common model for performing
data analytics will be introduced, and examples of audit practice using the respective data
analysis technologies will be presented. We end with a discussion of the opportunities and
challenges of data analytics that SAIs may face in the era of big data.

6.2.2 How can data analytics help auditors?


Basically, data analytics is defined as the process of inspecting, cleaning, transforming, and
modeling data with the goal of highlighting useful information, suggesting conclusions, and
supporting decision making. It is an analytical process by which insights are extracted from
operational, financial, and other forms of electronic data internal or external to the organization.

Since all organizations are impacted by big data in various forms, it is nearly impossible to
conduct an effective audit without using data analytics. The use of data analytics allows auditors
to view high level organizational operations and drill down into the data. It can be used
throughout all phases of an audit and can also be applied in different kinds of audit. Therefore,
it is important for auditors to realize that the use of data analytics is not limited to the scope and
activities associated with IT audit alone.

As we all know, the scale and scope of data available for our work as oversight organizations
continues to increase at a rapid pace. This has been driven by the increasing power and
availability of data analytic technology. Obtaining access to data, analyzing data, and developing
insights will continue to be an essential part of the work that SAIs do - both now and in the future.
There are a number of opportunities that big data and enhanced data analytics can bring to the
audit community. For example, there is the opportunity to improve the transparency, availability,
accuracy and usability of government data. This means having access to not just “big data”, but
also to “good data”. Further, by leveraging evolving technologies, tools, and techniques in the
area of data analytics, SAIs can greatly enhance the impact of their work. These approaches can
give SAIs the opportunity to further strengthen audit findings and conclusions. They can also
highlight potential new solutions to stop problems before they occur. This might include helping
to prevent fraud and waste in government programs before the money is even spent.

A number of specific analytical techniques have proven highly effective in analyzing data for
audit purposes. For example:

 Calculation of statistical parameters (e.g., averages, standard deviations, highest and lowest
values) to identify outlying transactions;
 Classification to find patterns and associations among groups of data elements;
 Stratification of numeric values to identify unusual (i.e., excessively high or low) values;
 Correlation of different data sources to identify inappropriately matching values in disparate
systems;
 Summing of numeric values to check control totals that may contain errors;
 Validation of data entry dates to identify postings or data entry times that are inappropriate
or suspicious;
 Cluster analysis to determine the audit focus.
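Two of these techniques can be sketched in a few lines of code. The Python fragment below, using invented payment amounts rather than real audit data, illustrates flagging outliers with statistical parameters and stratifying numeric values into bands.

```python
import statistics

# Invented payment amounts standing in for a transaction listing.
amounts = [120.0, 135.5, 98.0, 110.0, 4250.0, 125.0, 102.5, 140.0]

mean = statistics.mean(amounts)
stdev = statistics.stdev(amounts)

# Statistical parameters: flag values more than two standard
# deviations above the mean as outlying transactions.
outliers = [a for a in amounts if a > mean + 2 * stdev]

# Stratification: count how many values fall into each band to
# spot unusually high or low strata.
bands = {"low (<100)": 0, "normal (100-1000)": 0, "high (>1000)": 0}
for a in amounts:
    if a < 100:
        bands["low (<100)"] += 1
    elif a <= 1000:
        bands["normal (100-1000)"] += 1
    else:
        bands["high (>1000)"] += 1

print(outliers)  # [4250.0]
print(bands)
```

The same two checks scale directly to full transaction populations in audit software or SQL.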

Data analytics can be used throughout a typical audit cycle. While individual audit cycle
definitions and steps may vary, the following breakdown provides some of the ways data
analytics can be employed during various stages in an audit cycle.

 Planning. Data analytics can be greatly effective in identifying data-driven indicators of risk
or emerging risk in an organization. This can help auditors define and create an audit plan
that focuses on the areas of highest concern. The audit activity should consider prioritizing
the use of data analytics for risk assessment during the audit planning stage, where the data
is available and where this approach is applicable.
 Preparation. Data access and preparation can be a challenging step within the audit process.
Requests to IT departments can take weeks and the resulting data can often be incomplete
or incorrect, making for an inefficient process. By using data analysis technology during the
audit preparation phase, many of these delays can be avoided. Auditors skilled in the use of
data analytics can source the data required for the audit engagement and perform data integrity
and validity checks. This will provide audit teams with streamlined access to reliable data sets
or even automated access to multiple data sources to allow for quick and efficient analysis
of data. Data should be housed in a centralized repository allowing the audit team to analyze
data sets according to their authorization and need for access.
 Execution. Traditionally, auditors have relied on techniques such as sampling or spot checks
to deal with ever increasing amounts of data. These techniques may be ineffective at
uncovering anomalies and indicators of failed or inefficient internal controls. To improve
effectiveness in the search for errors and unusual transactions, audit teams can use data
analytics to analyze entire data populations. Once initial analysis is done, efforts can be
focused on areas where exceptions were found, making more efficient use of audit
resources. The repetitive use of analytic scripts focusing on specific kinds of fraud increases
overall efficiency and allows for greater insight into high risk areas. Results and scripts
should be stored in a centralized repository allowing audit team members to review findings
and access and re-use analytic procedures.
 Quality assurance. The analytic routines and the results they generate should be included
in the quality review. This helps ensure that the conclusions drawn from data analytics can be
relied upon, and that any mistakes in the queries, or in the conclusions drawn from their results,
are identified and corrected.

6.2.3 Performing data analytics


Based on audit practice, a common procedure for performing data analytics has taken shape as
the following seven steps: data requirements, data collection, data processing, creating
intermediate tables, choosing key areas, data analysis, and validating the evidence (see Figure 6.5).

Figure 6.5 Seven steps of data analytics

Step 1. Data requirements. Technical documents of the relevant systems of the audited entity
should be collected according to the audit objective in order to achieve a detailed understanding
of the database and data situation. One of the most important technical documents is the data
flow diagram (see Figure 6.6), which shows what information will be input to and output from
the system, how the data will move through the system, and where the data will be stored. It
helps us understand the structure of the system and the relationships among systems. Another
is the data dictionary (see Figure 6.7), which defines the basic organization of a database. It is a
collection of tables with metadata, containing a list of all files in the database, the number of
records in each file, and the names and types of each field. It helps us understand the data
structure. After going through these technical documents, and based on the audit objective, we
can propose an audit data requirement asking the audited entity to prepare the data accordingly.
The data requirements could include complete databases, selected tables from the databases,
selected data fields of tables, or data pertaining to specific criteria/conditions for a particular
period, location, class, etc. Depending on the data size, this may be obtained in flat file or dump
file formats.

Figure 6.6 Example of Data Flow Diagram

Figure 6.7 Example of Data Dictionary

Step 2. Data collection. During this step, we have to export the required data from the database
of the audited entity and import it into the audit data environment. If read-only access to the
system is granted by the audited entity, data stored in tables relevant to the audit can be
extracted by querying the database, provided the skill set exists. Otherwise, the entity can be
requested to provide a copy of the relevant source data. Auditors may have to create an
environment similar to the audited entity's (compatible versions of common database
applications, operating systems, hardware, etc.) to import and analyze data from the extracted
data dumps. Audit software or Extract, Transform and Load (ETL) tools can be used to import
data from varied database platforms.

Step 3. Data processing. Good quality data which is clean, complete and devoid of errors is
essential for good analysis. However, as we have discussed in the previous session, data will
never be perfect. There may be duplicates, missing values or erroneous types. Thus, cleaning the
data is important, especially when importing data from different source files. Consequently,
auditors need to spend time normalizing and aggregating the information to make sure the
format is consistent for all data, thus helping to ensure the accuracy of analysis results. Auditors
also have to identify incomplete, incorrect, inaccurate or irrelevant parts of the data and then
replace, modify, or filter out the inaccurate or corrupt data. Furthermore, auditors may be
required to convert data from one form to another to facilitate better reading and analysis.
After these tasks are completed, auditors have to verify the data to ensure its integrity (i.e.,
that no data has been lost). Basically, we choose some key indicators, such as the number of
records or the sum of numeric values, and compare them with the source database or the
financial statements of the audited entity.
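These cleaning and verification tasks can be illustrated with a minimal Python sketch; the record layout and the control totals below are invented for the example, not taken from any real audited system.

```python
# Invented raw records with typical quality problems.
raw_records = [
    {"id": "001", "amount": "1200.50"},
    {"id": "002", "amount": "300.00"},
    {"id": "002", "amount": "300.00"},   # duplicate row
    {"id": "003", "amount": None},       # missing value
    {"id": "004", "amount": "150.25"},
]

# Remove exact duplicates while preserving order.
seen, deduped = set(), []
for r in raw_records:
    key = (r["id"], r["amount"])
    if key not in seen:
        seen.add(key)
        deduped.append(r)

# Filter out incomplete records and convert amounts to a numeric type.
clean = [
    {"id": r["id"], "amount": float(r["amount"])}
    for r in deduped
    if r["amount"] is not None
]

# Verify integrity against key indicators from the source system
# (both figures are assumed here for the example).
source_record_count = 3
source_amount_total = 1650.75
assert len(clean) == source_record_count
assert abs(sum(r["amount"] for r in clean) - source_amount_total) < 0.01
```

The same pattern of deduplicating, filtering, converting and then reconciling against control totals applies whatever tool is actually used.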

Step 4. Create intermediate table. Nowadays, most databases use a relational format to organize
data (see Figure 6.8), which is more economical and efficient: we do not have to record all the
relevant information in each table, just the unique variables, and we use key variables to link the
tables and form the complete information. For audit, however, the situation is completely
different. Since the information we want to analyze is dispersed across several tables, we have
to reproduce the data and combine all the components into one or several intermediate tables,
which form the basis of the data analysis.

Figure 6.8 Example of relational database

For example (see Figure 6.9), there are three tables in the database. The table “books” uses
“book_id” as its primary key and records the book names, while the table “author” uses
“author_id” as its primary key and records the author names. These two tables are connected
by the table “books_has_author”, which records the correspondence between “book_id” and
“author_id”. If we want to list author names in descending order by the number of books they
have written, we first have to create an intermediate table including “author_id”, “author_name”
and “book_id”.

Figure 6.9 Example of creating intermediate table

Step 5. Choosing key areas. We conduct a general analysis according to the audit objective, find
areas of significance or weakness, and determine the key areas for relatively detailed data
analysis. For example, if we are going to audit a commercial bank, and one of the audit objectives
is to detect fraud in improper loan extension, we may select some key indicators (e.g., which
branch has the highest frequency of loan extensions and which has the highest non-performing
loan ratio) to perform a general analysis and find out which branch should be the target.

Step 6. Data analysis. The duly prepared data is analyzed to derive insights using various analytic
approaches. By digging into the data, one might gain very interesting insights, which can guide
auditing. These insights can be historical, real time, or predictive, and can be risk-focused (e.g.,
controls effectiveness, fraud, waste, abuse, policy/regulatory noncompliance) or
performance-focused (e.g., increased sales, decreased costs, improved profitability). The
following approaches can be used in data analytics:

 Descriptive analytics tries to answer “what has happened”. It provides an understanding of
the past transactions that occurred in the organization and involves aggregation of
individual transactions, thus giving meaning and context to individual transactions in a
larger perspective. It involves summarization of data through numerical or visual
descriptions.

 Diagnostic analytics is an advanced form of descriptive analytics and tries to answer the
question “why did it happen” or “how did it happen”. It involves an understanding of the
relationship between data sets and identification of specific transactions/ transaction sets
along with their behavior and underlying reasons. Drill down and statistical techniques like
correlation assist in this endeavor to understand the causes of various events.
 Predictive analytics tries to predict “what will happen”, “when will it happen” and “where
will it happen” based on historical data. Various forecasting and estimation techniques can
be used to predict, to a certain extent, the future outcome of an activity.
 Prescriptive analytics takes over from predictive analytics and allows the auditor to
‘prescribe’ a range of possible actions as inputs, such that future outputs can be steered
toward the desired outcome. In prescriptive analytics, multiple future scenarios can be
identified based on different input interventions.

Many types of analytic techniques can be applied to extract insights, such as:

 Rule-based: filter noncompliance with laws, regulations and rules;
 Anomaly: detect individual and aggregated abnormal patterns versus a peer group;
 Predictive: assess against known fraud cases;
 Network: discover knowledge through associative link analysis.
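As a toy illustration of the first two techniques, the fragment below applies a fixed compliance rule and a simple peer-deviation check to invented expense claims; the approval limit and the z-score threshold are assumptions chosen for the example.

```python
import statistics

# Invented expense claims per branch.
claims = [
    {"branch": "A", "amount": 900},
    {"branch": "A", "amount": 1100},
    {"branch": "B", "amount": 950},
    {"branch": "B", "amount": 5200},  # far above its peers
    {"branch": "C", "amount": 1000},
]

# Rule-based: filter claims breaching a fixed approval limit.
LIMIT = 2000
rule_hits = [c for c in claims if c["amount"] > LIMIT]

# Anomaly: flag claims far from the peer-group mean. A modest
# z-score threshold is used because a single large value also
# inflates the standard deviation in a tiny sample.
amounts = [c["amount"] for c in claims]
mean, stdev = statistics.mean(amounts), statistics.stdev(amounts)
anomalies = [c for c in claims if abs(c["amount"] - mean) / stdev > 1.5]

print(rule_hits)   # [{'branch': 'B', 'amount': 5200}]
print(anomalies)   # the same claim is flagged by both checks
```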

Step 7. Validate the evidence. Given the results from the analysis phase, auditors have to follow
the trail to validate the evidence. Although data can yield insights, not everything the data tells
us is true. The situation could be completely different because of information that was not
included in the audit data environment; that is why this step is essential.

The fourth step divides the process into two parts: the purpose of the first three steps is to
acquire the data, while that of the last three is to analyze it.

6.2.4 Examples of Audit Practice


Even as more data is being made available, we need to ensure that the SAI has the capability to
analyze that data. This means having the right people, skills, technology, and techniques to
conduct insightful analysis of large and complex data sets. Increasingly sophisticated analysis
may be required to leverage new sources of data. Moreover, with the proper application of big
data analytics, SAIs can be both more efficient and more effective in the way they do audit work.
Here are some examples of audit practice assisted by employing different technologies.

Case 1: Anti-money laundering by using SQL

Rich data sources make the identification of potentially fraudulent activities easier, because it is
difficult for a fraudster to change all upstream non-financial transactions to cover up financial
statement fraud. Thus, trade-based money laundering can be detected by correlating and
comparing different data sources. First, obtain the bank account statements from the audited
banks. Auditors can then identify bank accounts with the following characteristics as suspicious
targets by using Structured Query Language (SQL) analysis (see Figure 6.10).

 Transaction period is quite short, usually less than one year.
 Transaction types are quite simple: money is received from different bank accounts and
almost the same amount is transferred to another, leaving a very small ending balance each
day.
 The frequency and total amount of the transactions are extremely high.
 Most of the transactions were initiated via the internet channel, and a shared IP address
may suggest that a group of accounts is controlled by the same person.
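A simplified version of such a screening query can be sketched with SQLite from Python; the schema, the sample rows and the thresholds are invented for illustration, not the actual bank layout behind Figure 6.10.

```python
import sqlite3

# Invented transaction listing: account, amount, end-of-day balance, IP.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE txn (account TEXT, amount REAL, ending_balance REAL, ip TEXT);
INSERT INTO txn VALUES
  ('ACC1', 99000, 12.0, '10.0.0.9'),
  ('ACC1', 98500, 8.0,  '10.0.0.9'),
  ('ACC2', 97000, 5.0,  '10.0.0.9'),
  ('ACC3', 1200,  900.0,'10.0.0.7');
""")

# Accounts with high-value pass-through activity and tiny ending balances.
suspicious = con.execute("""
SELECT account, COUNT(*) AS n_txn, SUM(amount) AS total,
       AVG(ending_balance) AS avg_bal
FROM txn
GROUP BY account
HAVING total > 100000 AND avg_bal < 100
""").fetchall()

# IP addresses shared by several accounts (possible common controller).
shared_ips = con.execute("""
SELECT ip, COUNT(DISTINCT account) AS n_accounts
FROM txn
GROUP BY ip
HAVING n_accounts > 1
""").fetchall()

print(suspicious)  # [('ACC1', 2, 197500.0, 10.0)]
print(shared_ips)  # [('10.0.0.9', 2)]
```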

Figure 6.10 Select suspicious bank accounts

After the suspicious bank accounts have been selected, auditors may refer to the following data
sources to find out whether these companies are engaged in money laundering.

 Companies’ registry information. Most of these companies are small foreign trade
companies with little capital. Many of their registration certificates had expired, and some
companies share the same address and corporate executives, which may indicate the
presence of a single controller.

 Taxation information. Most of these companies did not pay any tax during their lifespan.
 Foreign trade information. There would be a huge gap between the transaction amounts of
their bank accounts and the actual value of the shipped goods.

Case 2: Land resources audit by using GIS

With the continuous development of IT applications, such as geographic information system (GIS)
and global positioning system (GPS) technology, auditors can conduct land and water resources
audits, focusing on lake reclamation, sea reclamation, occupation of farmland, land requisition,
etc. Basically, we use ArcGIS technology to compare geographic data from different sources and
to find out whether there is any overlap among these source data.

For example, by stacking graphic layers, auditors may find suspicious overlapping areas between
farmland data and forestry data. These overlapping areas can then be compared with the arable
land areas that received subsidies from financial and agriculture authorities. If there is a match,
it might indicate subsidy fraud (see Figure 6.11).
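The layer-overlay idea can be illustrated with a deliberately simplified model in which each parcel is an axis-aligned rectangle (xmin, ymin, xmax, ymax); real GIS work uses true polygon geometry, and the parcel IDs and coordinates here are invented.

```python
def overlap_area(a, b):
    """Area of intersection of two rectangles (xmin, ymin, xmax, ymax)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0

farmland = {"F1": (0, 0, 10, 10), "F2": (20, 20, 30, 30)}
forestry = {"W1": (8, 8, 15, 15)}
subsidized = {"F1", "F2"}  # parcels that received arable-land subsidies

# Farmland parcels that overlap forestry land and also received subsidies.
flags = [
    (f, w, overlap_area(fr, wr))
    for f, fr in farmland.items()
    for w, wr in forestry.items()
    if overlap_area(fr, wr) > 0 and f in subsidized
]
print(flags)  # [('F1', 'W1', 4)]
```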

Figure 6.11 Overlap between farmland data and forestry data

Another good example of the application of ArcGIS technology is verifying the authenticity of
land requisition and demolition compensation. Since the related physical evidence disappears
once the houses and their appurtenances are removed, it is difficult for auditors to verify the
demolition compensation information provided by the audited entities. The application of ArcGIS
technology can help solve this problem. First, we use GPS data to locate the target areas. Then
we can review satellite images of the target location through the Google Earth software to verify
whether a removed house that received compensation was an illegal construction (see Figure
6.12).

Figure 6.12 Review the satellite images

Case 3: Social network analysis (SNA) by using graph database

In today’s digital environment, social networks have become prominent as a result of a growing
number of social communities: Facebook, LinkedIn and, in China, WeChat are probably the most
prominent general-purpose communities. Social communities have also increased the availability
of network data, which in turn has led to more studies on social network analysis.

Social network analysis is the process of investigating social structures through the use of
network data and graph databases. If you want to gain a deeper understanding of the
relationships among companies or persons, this kind of analysis is a good choice. First, we have
to understand what a graph database is. Generally speaking, a graph database is a database that
uses graph structures for semantic queries, with nodes, edges and properties to represent and
store data.

As a simple example, consider four tables in a relational database (see Tables 6.4.1-6.4.4),
containing information on companies, relationships between persons, projects and government
employment.

Company_id Company_name CEO_id CEO_name Parent_company_id Share
C100001 Company A R000001 Argon C100001 100%
C100002 Company B R000002 Angel C100001 80%
C100003 Company C R000003 Allen C100002 75%
C100004 Company D R000004 Jerry C100003 55%

Table 6.4.1 Information of the company

Project_id Project_name Prime_contractor Approval_department
P00001 Subway Construction Project Company D Government Office A
P00002 School Construction Project Company A Government Office B

Table 6.4.2 Information of the project

Personal_id_1 Personal_name_1 Relationship Personal_id_2 Personal_name_2
S00001 Tom Parents K00001 Joshon
S00002 Lucy Couple K00002 Jack
S00003 Lily Parents K00003 Tang
S00004 Jim Couple K00004 Jessy
S00005 Nancy Parents K00005 Donald
S00006 Rock Couple K00006 Angel
S00007 Kaka Parents K00007 Micky

Table 6.4.3 Information of the relationship

Office_name Employee_id Employee_name
Government Office A S00001 Tom
Government Office A S00002 Lucy
Government Office A S00003 Lily
Government Office A S00004 Jim
Government Office A S00005 Nancy
Government Office A S00006 Rock
Government Office A S00007 Kaka

Table 6.4.4 Information of the government employment

If we want to study the network among these entities based on the relational database, we have
to correlate the four tables using SQL and retrieve the network structure. This would take a lot
of time, and the results would not be convenient for extracting insights. Performing the SNA on
a graph database is much easier. For example, if we are interested in the Subway Construction
Project, we can simply pick up the relevant entities with a single command. Based on the results
(see Figure 6.13), auditors can easily find that there is a relationship between Rock (an employee
of Government Office A) and Angel (CEO of Company B), while Company B controls Company D
indirectly through Company C. Since Company D is the prime contractor of the Subway
Construction Project, which is approved and financed by Government Office A, one may wonder
whether Rock is attempting to perpetrate fraud and help his wife’s company get the contract.
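The "pick up all relevant entities" step can be sketched in plain Python as a graph traversal; the edges below encode only the rows of Tables 6.4.1-6.4.4 that matter for this example, and a real graph database would express the same retrieval as a single query.

```python
# Edges derived from the four tables, stored as an adjacency list.
edges = [
    ("Rock", "Government Office A", "employed_by"),
    ("Rock", "Angel", "couple"),
    ("Angel", "Company B", "ceo_of"),
    ("Company B", "Company C", "parent_of"),
    ("Company C", "Company D", "parent_of"),
    ("Company D", "Subway Construction Project", "prime_contractor"),
    ("Government Office A", "Subway Construction Project", "approved"),
]
graph = {}
for a, b, rel in edges:
    graph.setdefault(a, []).append((b, rel))
    graph.setdefault(b, []).append((a, rel))

def connected(start):
    """Collect every entity reachable from `start` (the relevant subgraph)."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt, _ in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

# Everything connected to the project, i.e. the subgraph of Figure 6.13.
print(sorted(connected("Subway Construction Project")))
```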

Figure 6.13 Social network analysis

Case 4: Social listening by using text analytics

Social listening has become one of the most popular new analysis tools, as in this big data era
unstructured and specifically text data has become omnipresent in online communities and
social media such as Facebook and Twitter. These data are accessible to multiple parties, and
standard tools are available, such as Radian6 for social media monitoring. Text analytics is used
for the analysis of social data, and it could be adopted by SAIs to perform sentiment analysis,
review audit reports, track audit recommendations and monitor public correspondence.

As a first step, auditors have to decide what text they want to analyze. This starts with a clear
objective. For example, if the SAI wants to gain insights from public reviews of a specific audit
report, the raw review data should be selected. However, if the SAI wants to measure sentiment
about the performance of a specific policy, data on what is being written about the policy across
multiple social media should be collected. These examples already imply that the collection of
data is not a trivial step. There are issues surrounding where to collect data, the time period to
include, and which data will be collected. Once the data are collected, a database for text data
should be set up, which can then be used for the analysis.

For example (see Figure 6.14), text collected via web scraping was topic-modelled using R and
then built into an interactive dashboard. This allows auditors to look at the different topics
identified and how closely related the topics are. The topics were identified by the program itself,
i.e., through unsupervised machine learning.
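A very reduced form of such text analysis can be sketched in plain Python with word frequencies; the sample reviews and the stop-word list are invented, and real topic modelling (as with the R dashboard above) would use a dedicated library.

```python
import re
from collections import Counter

# Invented public reviews of an audit report.
reviews = [
    "The audit report on school construction was clear and useful",
    "Useful report, but the school construction findings came too late",
    "Construction oversight improved after the audit report",
]

# A small, hand-picked stop-word list for the example.
stopwords = {"the", "on", "was", "and", "but", "too", "after", "came"}

# Lowercase, tokenize and drop stop words across all reviews.
tokens = [
    w
    for text in reviews
    for w in re.findall(r"[a-z]+", text.lower())
    if w not in stopwords
]

# The most frequent remaining terms hint at the dominant themes.
counts = Counter(tokens)
print(counts.most_common(3))
```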

Figure 6.14 Text analytics

6.2.5 Opportunities and Challenges

6.2.5.1 Four levels of data analytics within the SAI


The effective use of data analytics is not only a technical innovation but also a revolution in audit
mentality, system and methodology. Competency in understanding key business processes of
audit entities and being able to analyze and interpret the data that reflects those activities is a
requirement. SAIs should be able to assess the levels of sophistication or capability within their
organization to ensure that it aligns with their strategic development plan.

 Level 1 - Basic Use of Data Analytics. This level is characterized by the basic use of data
analytics to perform queries and analyze data in support of a specific audit objective.
Activities typically include statistical analysis, classification, or summarization of data.
Usage is usually ad hoc by a limited number of audit staff and may be unplanned. This use
of data analytics helps auditors rapidly gain insight into risks and control issues in a given
audit area. However, the use of data analytics can be better integrated into audit procedures
and at different stages in the audit cycle. This requires an investment in changing audit
processes, including educating audit staff in the concepts of data analysis and the technology itself.
 Level 2 - Applied analytics. Usage at this level builds on the basic level and is characterized
by data analytics being fully integrated into targeted audit processes. Both audit planning
and the design of an audit program take data analysis into account - effectively creating a
“data analytics assisted audit”. Within this more structured approach to using data analytics,
comprehensive suites of analysis models may be created, reviewed, and subject to quality
assurance procedures. At this level, data analytics begins to transform the audit process,
providing substantial improvements in efficiency, levels of assurance, and the overall value
of findings.
 Level 3 - Managed analytics. The Managed level is the logical evolution from the applied
stage. This increased level of sophistication is in response to some of the challenges inherent
in a more widespread, decentralized use of data analytics. In this more organized and
controlled approach to data analysis, data, models, results, audit procedures, and
documentation are stored in a centralized and structured repository. Access to and use of
this content is aligned with key audit procedures and is controlled and secure. This makes it
more practical for nontechnical auditors to access and use the results of data analysis. Once
data analytics is managed centrally, audit teams can benefit by increased efficiency through
the sharing of data analysis work (data, models, and results). Data analytics becomes repeatable
and sustainable, making it easier to maintain the overall quality and consistency of analytic work.
 Level 4 - Automated analytics. The automated level builds on the capabilities established
to support managed analytics. The building blocks established at the previous levels form
the basis for increased automation of analytic processes and, where appropriate, the
implementation of real-time auditing. Data access protocols have been established for the
automated running of data analysis. Comprehensive suites of models have been developed,
tested, and are available in a central, controlled audit environment.

6.2.5.2 Challenges
So far in this module we have mainly discussed the value opportunities of big data and big data
analytics. Indeed, these value opportunities can be considerable. However, there are still some
challenges we have to take into consideration in the era of big data.

 Audit Evidence Risks Caused by New Techniques and Audit Quality Control

While enjoying the benefits of new techniques, we have to realize that various kinds of risks lie
behind new technology, in aspects such as evidence validity, integrity and
interpretability. Therefore, when applying innovative technologies in audit work, we need to
keep a clear understanding and remain prudent. For instance, in data analysis, authenticity is
the basis of data value, yet it is also one of big data's inherent weaknesses. Similarly, data
completeness is an inherent limitation of big data as well.

Currently, big data still resembles a set of isolated islands: no single department can obtain data
sufficient in both breadth and depth. Also, regarding causality and correlation, big data analysis
depends on machines, but machines can only reveal correlations among the data; establishing
causality requires human thinking and judgment. In the process of data interpretation, if we
merely consider correlation and ignore causality, we will probably reach a wrong or even
dangerous conclusion. Therefore, how to control the risks caused by big data analytics and how
to enhance audit quality is a critical challenge that may even influence the future of data-
analytics-assisted audit.

 Establishment of Audit data center

The design of the audit data center shall have the characteristics of high flexibility, high
performance, fault tolerance, scalability and low operation cost. In addition, it shall meet the
technical characteristics of big data with respect to volume, velocity, variety and veracity. With
respect to the data sources, the audit data center covers the operational and managerial data of
the audited entities and external data from the internet; with respect to function, it involves
data sorting, storage, query, comprehensive analysis and services; with respect to procedure, it
collects raw data from the audited entities and the internet, stores and processes the raw data
in different storage formats within its own database, and finally provides analysis services on
the standardized data through a parallel database.

At present, open-source big data technologies have become the mainstream of the market, and
the popular architectures, such as Hadoop, Spark and Storm, each have their own characteristics.
However, competition among open-source technologies has also left big data platform standards
unsettled, complicating the selection of core technologies for a big data audit architecture.

 Standardization of data source

Audit deals with information from diverse sectors. In many cases, one entity may have several
separate systems performing different tasks. Hence, data submitted by different departments
often differ due to the diversity of data sources. This situation presents challenges for auditors
in gathering, converting and storing data. In audit practice, source data may come from large
databases, desktop databases and files in different formats or versions, leading to inconsistency.
Thus, during the data preparation phase, auditors often spend a lot of time finding data
differences and analyzing the reasons, delaying the entire audit process. Moreover, it is hard to
verify data when the business processes are complex. Therefore, making source data accessible
in a standardized digital format is one of the problems to be solved urgently.

6.3 Summary

In this session, we started with the question “how can data analytics help auditors?” Data
analytics should be viewed as an enabling technology that can deliver great value to SAIs by
improving efficiency, effectiveness, and the levels of assurance that can be provided. Its impact
is not limited to reducing the time needed to conduct an audit; it also aids in the detection of
errors, control breaches, inefficiencies, and indicators of fraud. In employing data analytics, it
must be recognized that there will be changes to the processes and activities that need to be
carried out, and to the technologies that can be leveraged to gain the desired insights. Getting
the right data, understanding what the analytics indicate, and following up on the results of
analysis can be a significant task.

6.4 References

Chen Wei & Smieliauskas Waly (2016). Opportunities, Challenges and Methods of Electric Data
Auditing in Big Data Environments. Journal of Computer Science, 43(1), 8-13.

Davenport T., & Harris J. (2007). Competing on analytics - The new science of winning. Boston,
MA: Harvard Business School Press.

Frederick Gallegos, Sandra Senft, Daniel P. Manson & Carol Gonzales (2004). Information
Technology Control and Audit (Second edition). CRC Press LLC.

INTOSAI. ISSAI 5300 – Guidelines on IT Audit.

Jack J. Champlain (2003). Auditing Information Systems (Second Edition). John Wiley & Sons.

Nair R., & Narayayan, A. (2012). Getting Results from Big Data: A Capabilities- Driven Approach
to the Strategic Use of Unstructured Information. Booz&Hamilton.

Office of the Comptroller and Auditor General of India (2017). Guidelines on Data Analytics.

Peter C. Verhoef, Edwin Kooge & Natasha Walk (2016). Creating value with big data analytics:
Making smarter marketing decisions. Routledge.

The Institute of Internal Auditors (2011). Global technology audit guide (GTAG): Data analysis
technologies.

The Institute of Internal Auditors (2017). Global technology audit guide (GTAG): Understanding
and Auditing Big Data.

U.S. Government Accountability Office (2013). Data Analytics for Oversight & Law Enforcement.

U.S. Government Accountability Office (2016). Data and Analytics Innovation: Emerging
Opportunities and Challenges.

U.S. Government Accountability Office (2017). Data Analytics to Address Fraud and Improper
Payments.

Viktor Mayer-Schönberger & Kenneth Cukier (2012). Big Data: A Revolution That Transforms
How we Work, Live, and Think. Houghton Mifflin Harcourt.

Working Group on IT Audit (2014). IDI Handbook on IT Audit for Supreme Audit Institutions.
