Chapter 1. Re-establishment of the Framework for Using Big Data to Prevent Crimes
In today's societies, which face an uncertain future and high risks of many kinds, big data improves our ability to predict the future by enhancing our capacity to analyze and respond to reality, and it allows us to seek a better future through objective, scientific, data-based approaches. In the public sector in particular, big data is expected to serve as a cutting-edge tool of scientific analysis for managing risks that may threaten the nation, and as a core source for national policies that prepare for an uncertain, high-risk future.
However, big data, which qualifies as a new form of knowledge in both form and technology, carries both the possibility of creating unlimited value for the future and risks whose nature we do not yet know. At present, big data's potential for creating technological value opens up countless possibilities, while the outcomes and risks that follow from that potential remain uncertain. Clear objections have nonetheless been raised against big data, and they strongly warrant a transformation of the technological and legal framework. The big data environment involves data generation and use that go beyond the boundaries of the existing paradigm, which has drawn criticism for possible infringement of people's right to information and their privacy.
Despite these uncertainties and risks, countries across the world are using big data to strengthen their preventive administration systems so that they can address national issues in a reasonable and preemptive manner. In particular, the use of big data for crime prevention, investigation, and other public safety purposes is emerging as a major topic in multiple countries.
Given that the production of big data and prediction-based crime prevention using it are gaining significance, this Study seeks to propose a way to use big data technologies to prevent crimes while minimizing the risks to personal information, human rights, and privacy inherent in those technologies. More specifically, this Study aims to provide an overview of the rapid development of information and communication technologies and of the changes in crime brought on by changes in social structures, review the necessity and limitations of crime prevention policies that use big data, and lay the foundation for discussing legislative approaches that take into account the risk of privacy invasion inherent in big data.
2. Description and Method
This Study is part of a two-year project. The first-year study gave a broad overview of big data and reviewed discussions in and outside South Korea on people's right to information, focusing on the relationship between big data and the protection of personal information. It then reviewed and defined the potential of big data for crime prevention, as well as the nature of the right to be forgotten as a legally protected interest and the scope of that protection from a criminal policy perspective.
The purpose of this Study, conducted in the second year of the project, is to carry out a comparative analysis of the legal systems governing big data use in various countries, analyze the outcomes of crime prevention systems used overseas, discuss the technical limitations of crime prevention systems based on big data, and propose a model framework. In addition, a survey was carried out to identify how the general public and experts in the relevant fields perceive the possibility of using big data for crime prevention, and in-depth interviews with big data experts were conducted to identify issues concerning the relationship between the development of big data technologies and crime prevention systems. Based on the findings of the survey and interviews, this Study seeks to propose possible legislative improvements.
3. Reorganization of the Framework for Use of Big Data for Crime Prevention Purposes
If we are to use big data to prevent crimes, the existing framework for data use and protection of personal information needs to be reorganized. In other words, to maintain a reasonable level of protection for information privacy, which is in technical conflict with big data, the framework for big data use must be redefined in terms of the management and assessment of the risks that big data poses.
In most countries, data protection laws put forward consent as the most prominent of the elements that render data processing lawful. Requiring consent is a highly effective way to protect data in an environment where computer systems are not closely connected with each other. In an environment where data is collected and used on a big data scale, however, consent has lost its relevance as a means of protecting data. On the contrary, the consent requirement has led to highly unreasonable outcomes: it creates opportunities to collect or use excessive personal information, thereby allowing invasion of information privacy while shifting the responsibility for that invasion to the data subjects. To address this issue, new approaches to the data subject's consent are being discussed, such as the balancing model, the replacement model, and the exemption area model.
As the traditional approaches to data protection are no longer suited to the goals for which they were designed, data risk management is being asked to shift its focus from the conventional control of data collection to the management of data use. Risk management based on data use in a big data environment involves assessing privacy risks at the time the data is used rather than at the time it is collected. It has the potential to become the most effective way to protect individual citizens' privacy against the risks posed by data use while still allowing the value of the data to be realized.
However, some commentators argue that reforming the legal framework may not be a sufficient solution for privacy protection in future big data environments. In their view, technical structures for data protection must be put in place as early as the design phase rather than the data use phase, so that effective privacy protection is secured from the very beginning. Technologies, rather than legal provisions, may be a more realistic and useful way to give data subjects meaningful confidence in privacy protection. Privacy by design and de-identification technologies are proposed in this context. Privacy by design refers to a technical standard for developing information systems based on privacy compliance. It is widely used in engineering and standard building as an essential element of privacy protection. De-identification or anonymization refers to technically removing identifiability in big data environments. As the free flow of information is highly relevant in big data environments, de-identified or anonymized information is considered an important alternative that prevents privacy invasion while allowing free use of data, in that the processing and use of such information is not subject to legal restrictions.
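As an illustration of what de-identification can involve, the following is a minimal sketch, not a method described in this Study. It assumes a hypothetical record layout (`name`, `resident_id`, `age`, `district`) and a hypothetical secret key. It drops the name, replaces the identifier with a keyed hash, and coarsens the exact age into a band. Note that keyed hashing is a form of pseudonymization; whether the result counts as anonymized depends on what other data it can be combined with.

```python
import hashlib
import hmac

# Hypothetical secret key; an assumption for this sketch only.
# In practice it would be stored separately from the data.
SECRET_KEY = b"example-key-not-for-production"


def pseudonymize(identifier):
    """Replace a direct identifier with a keyed hash (HMAC-SHA256)."""
    digest = hmac.new(SECRET_KEY, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def generalize_age(age):
    """Coarsen an exact age into a ten-year band, e.g. 37 becomes '30-39'."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"


def deidentify(record):
    """Return a new record without the name, with a pseudonymous ID and an age band."""
    return {
        "pid": pseudonymize(record["resident_id"]),
        "age_band": generalize_age(record["age"]),
        "district": record["district"],
    }


# Hypothetical input record.
record = {"name": "Hong Gildong", "resident_id": "A-0001", "age": 37, "district": "Mapo"}
safe = deidentify(record)
```

The same identifier always maps to the same pseudonym, so records about one person can still be analyzed together without exposing who that person is.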
Chapter 2. Comparative Analysis of Legal Systems Related to Big Data Use in Developed Countries
1. Direction of Legislation on Big Data Use
Both in and outside South Korea, legal frameworks for the protection of personal information or privacy are predicated on the fair information practice principle (FIPP). Big data environments, however, involve certain aspects in which it is difficult to adhere to the FIPP, which forms the fundamental framework for privacy protection laws. Specifically, due to the nature of big data, new value not anticipated at the time of data collection may be discovered during subsequent analysis. As these new findings form the basis for secondary use of the data, the FIPP principles on data collection and use do not readily apply to such cases. These situations make it necessary to look into how developed countries are reforming their personal information protection laws with regard to the use of big data, and how such reforms can be applied to systems designed to prevent crimes.
Legislative efforts in developed countries approach the issue of big data use from two perspectives: disclosure and protection. In other words, developed countries have enacted information disclosure laws and personal information protection laws as framework laws for promoting the use of big data while preventing possible infringement of personal information.
On the basis of these two types of laws, each country is currently building a framework that establishes the rights and duties, autonomy and regulation, and authorities and responsibilities regarding personal information, with the ultimate goal of building better, safer, and more efficient big data environments.
2. Personal Information Protection Laws
To protect personal information, the United States enacted the Privacy Act, the United Kingdom and Germany enacted the Data Protection Act and the Bundesdatenschutzgesetz (Federal Data Protection Act), respectively, and the European Union established the Data Protection Regulation. While the U.S. Privacy Act applies only to records managed by administrative bodies, the British and German laws apply to both the public and private sectors. The difference between the UK and Germany is that the former applies the same provisions to both sectors, while the latter applies different provisions to the public sector than to the private sector.
In line with the original purpose of protecting personal information, each country clearly specifies the rights of data subjects in its information protection laws. The rights of data subjects commonly protected across countries include the right to access, the right to request revision of inaccurate information, and the right to prevent or remove infringement. The United Kingdom and Germany, which transposed EU directives and regulations into their national laws, also provide for the right to delete and the right to refuse profiling. In addition, obligations are imposed on data processing managers across all stages of data processing, for which they are responsible under the law. These laws seek to achieve meaningful protection by imposing stricter responsibilities for data collection, use, and processing on those who manage information acquired on the basis of consent, rather than on the data subjects who gave that consent. To strengthen supervision of whether data processing managers perform or violate these obligations, the relevant laws provide for the appointment of information protection supervisors (or members of an information protection commission), who are responsible for evaluating compliance with information protection laws, assessing intentional or negligent violations, and imposing appropriate sanctions.
Legal systems for information protection in different countries provide procedural means to object to infringement of a data subject's rights or to a refusal to comply with infringement claims. Such objections and remedies are indispensable for ensuring the effectiveness of the relevant rights. The United States, the United Kingdom, and the European Union provide for civil, criminal, and even administrative remedies, whereas Germany allows only damages under civil law.
3. Information Disclosure Laws
Information disclosure laws were enacted against different backgrounds in each country. The United States enacted its information disclosure law to ensure the transparency and democratic accountability of the government and to protect the people's right to know. The United Kingdom accepted the recommendations and directives of the European Council and the European Union. As for Germany, since the fundamental framework of the country's administrative system had been predicated on the confidentiality principle and limited disclosure of documents, the country had been the least active in disclosing information. As a result, Germany adopted its information disclosure act after similar laws had been adopted in other countries.
The information disclosure laws allow 'any person' or 'all persons' to request disclosure of information. That is, under these laws, anyone can request information held by any administrative body. Since disclosure of information may cause risks to individuals, institutions, and the nation as a whole, the laws provide for certain exemptions. In providing for these exemptions, the United States and the United Kingdom introduced a distinction between absolute and relative grounds for non-disclosure. In the case of an absolute ground for non-disclosure (full exemption), the administrative body has no obligation to comply with a disclosure request, nor any obligation to inform the requester whether the requested information exists. In the case of a relative ground for non-disclosure, the matter is subject to a balancing test to determine whether non-disclosure serves the public interest better than disclosure, or whether the data has any value that warrants continued protection.
In addition, each country provides for objections and remedies against an administrative body's response to a disclosure request. In most cases, the requester may file an objection directly with the relevant government body, and a court may issue an information disclosure order. While in most countries the grounds for objections and remedies are not clearly specified and are therefore open to interpretation depending on the situation, the United Kingdom provides for specific grounds for objection. The German law, for its part, is characterized by allowing objections against subjective infringements.
Chapter 3. Analysis of Operation and Performance of Crime Prevention Systems Utilizing Big Data in Foreign Countries
This study comprehensively reviewed crime prevention systems in foreign countries that use big data science, focusing on U.S. and U.K. cases. Since the start of the new century, proactive efforts have been made to predict and prevent crime, with notable positive outcomes. Examples include Predictive Policing, SMART Policing, and DDACTS (hereinafter, these policing strategies are referred to as “Big Data Policing”), which mostly originated in the U.S. and have expanded into the U.K., Australia, Canada, and other developed countries. On the other hand, issues such as unreliable prediction and human rights violations (e.g., invasion of privacy) have been raised consistently. Each government has been taking measures to address these issues, one of which is to set up data governance. Below is the summary of this report.
Big Data Policing applies advanced (mostly quantitative) analytical techniques, such as hot spot analysis, regression analysis, data mining, near-repeat modeling, spatio-temporal analysis, geographic profiling, and risk terrain analysis, to identify and intervene in high-risk places and people. It must be noted, however, that it is more than a mere technical concept that mingles crime analysis with big data science. Rather, Big Data Policing must be understood as a comprehensive policing strategy that inherits the partnership ideology of Community Policing and the systematic process of Problem Oriented Policing. As such, it has several key features:
- Strategic Approach to Goals
The primary mission of Big Data Policing is to improve the quality of life of communities by controlling not only crime and disorder but also all types of neighborhood problems, such as order maintenance and fear of crime. Thus, all community problems need to be monitored and analyzed in order to set priorities. Once the priorities are decided, it is recommended to establish specific action plans and develop a logic model that determines the criteria for measuring performance during the evaluation phase.
- Maximizing Data Utility
Making better use of data is the key criterion that distinguishes Big Data Policing from other policing strategies. Big Data Policing therefore requires collecting diverse data and applying adequate analysis techniques to discover problems and their causes as precisely as possible. First of all, data collection should be carried out accurately and quickly. Otherwise, precise analysis and timely response become difficult. Also, data must include not only crime and incident records but also diverse information on risk factors and quality of life. To that end, data integration among various organizations and the accumulation of semi-structured or unstructured data need to be carried out properly. Lastly, each department should be able to apply suitable prediction techniques described in the main body of the report to diverse situations. For this purpose, some large departments hire analysis experts. Small departments that cannot afford to hire experts are recommended to seek assistance from research institutes.
- Strategic Intervention
For problem-solving strategies and tactics to be effective and efficient, they need to be as narrow and deep as possible and optimized for the characteristics of the neighborhood, population, and individuals. Accountability should also be ensured by assigning a person in charge. At the same time, organizational flexibility needs to be strengthened so that other organizations and departments can provide assistance. One of the most important elements for maximizing the effectiveness of intervention is “situational awareness,” for which an effective communication system has to be built among all personnel, from commanders to field officers.
- Ongoing Monitoring and Evaluation
There are two types of policy evaluation: process evaluation and outcome evaluation. Since the 2000s, process evaluation has been gaining significance, because the expected outcome cannot be obtained without proper implementation of the planned process. Therefore, ongoing monitoring and evaluation need to be carried out from the planning phase through the implementation of police intervention. For strict and objective evaluation, it is important to have researchers perform (quasi-)experimental studies that compare experimental areas with control areas. Continuity is one of the main elements of evaluation. Continuous monitoring is required to detect any change in crime patterns or the neighborhood environment caused by criminals' reactions to police intervention, and this is what sustains the cyclic process of Big Data Policing.
- Collaboration and Outreach
Big Data Policing requires collaboration with the neighborhood in all phases, which is the only way to ensure both the effectiveness of crime prevention and the legitimacy of policing. Indeed, cooperation itself may improve the image of and trust in the police, which tends to have a positive impact on residents' feelings of security and satisfaction regardless of the actual crime situation. In addition, collaboration with other government agencies, non-profit organizations, and civil entities is important for collecting and sharing data, strengthening collective intelligence on big data strategy, and exchanging useful feedback. Finally, proactive outreach efforts are needed to consolidate this cooperation, which calls for public relations and education for residents. Having representatives of other organizations participate in the strategic meetings for Big Data Policing might be another good approach.
- Control Tower and Community
As seen in the example of the SMART Policing Initiative, it is desirable to secure the consistency and quality of a Big Data Policing program by building a Big Data Policing community. One of the most important measures is to invigorate working groups composed of experts and working-level staff and let them develop, distribute, and provide training on best practices. It is also critical that small departments get help from research institutions so that they can build big data systems, collect and analyze data properly, and carry out strict evaluations. Further, arranging meetings and visits among departments and holding web seminars could be another good measure. Finally, establishing a control tower and a community could be of great help in building effective big data governance.
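The comparison of experimental and control areas described under Ongoing Monitoring and Evaluation can be illustrated with a difference-in-differences estimate. This is one common way to make such a comparison, shown here as a minimal sketch; the report does not prescribe it, and the monthly incident counts below are hypothetical.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def difference_in_differences(treated_before, treated_after,
                              control_before, control_after):
    """Change in the experimental area minus change in the control area."""
    treated_change = mean(treated_after) - mean(treated_before)
    control_change = mean(control_after) - mean(control_before)
    return treated_change - control_change


# Hypothetical monthly incident counts before and after an intervention.
treated_before = [40, 44, 42]
treated_after = [30, 34, 32]
control_before = [38, 42, 40]
control_after = [36, 40, 38]

effect = difference_in_differences(
    treated_before, treated_after, control_before, control_after
)
```

A negative value suggests that incidents fell more in the experimental area than in the control area. The estimate is only as credible as the assumption that both areas would otherwise have followed the same trend.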
2. Program Operation and Performance Analysis in Overseas Crime Prediction Programs
In Western developed countries such as the U.S. and U.K., diverse programs that predict and intervene in crime and criminals are operated under Big Data Policing. Examples of crime prediction programs include PredPol, PILOT (Shreveport PD, U.S.), and Prospective Mapping (West Yorkshire PD, U.K.). While PredPol is the most popular commercial program, PILOT and Prospective Mapping were designed and developed by individual departments. London's Ring of Steel and New York City's Domain Awareness System are typical examples of real-time surveillance and response using information from CCTVs and license plate readers. In addition, diverse programs are operated by non-police agencies such as HMRC, which uses Connect to prevent and investigate tax evasion.
Compared to predicting crime, predicting criminals has a higher probability of error and faces harsher criticism for possible privacy and human rights violations. Nevertheless, ongoing efforts have been made, particularly in predicting re-offending by inmates and gang members. Examples include re-offense prediction using behavioral instruments by the Florida Department of Juvenile Justice, Random Forest Modeling by the Philadelphia Adult Probation and Parole Department, OASys by the Ministry of Justice of England, and Custom Notification by the Chicago PD. In addition, many attempts have been made to identify the residential areas of serial offenders using geographic profiling techniques.
Reactions to crime prevention programs that use big data do not consist entirely of praise for their achievements and expectations of a rosy future. In fact, many criticisms have been raised against the programs, based on such issues as prediction error, poor data quality, uncritical acceptance of analysis results, biased prediction of risk areas and people, displacement effects, privacy invasion, and human rights violations. To address these problems, the governments of the U.S., the U.K., and other countries have made great efforts to build governance for crime-related big data. Focusing particularly on master data management and public disclosure of data, they established a principle, based on the G8 Open Data Charter, of publicly disclosing as much government data as possible and making it readily available to people and machines across the world.
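Hot spot analysis, one of the techniques named earlier in this chapter, can be illustrated with a simple grid count. The following is a minimal sketch under stated assumptions: the coordinates, the cell size, and the function names are hypothetical, and it does not represent the method of any program named above.

```python
from collections import Counter

CELL_SIZE = 0.5  # hypothetical cell width, in the same unit as the coordinates


def cell_of(x, y):
    """Map a coordinate pair to the index of the grid cell that contains it."""
    return (int(x // CELL_SIZE), int(y // CELL_SIZE))


def hot_spots(incidents, top_n=2):
    """Count incidents per grid cell and return the top_n busiest cells."""
    counts = Counter(cell_of(x, y) for x, y in incidents)
    return counts.most_common(top_n)


# Hypothetical incident locations.
incidents = [
    (0.1, 0.2), (0.3, 0.4), (0.45, 0.1),  # three incidents in one cell
    (1.2, 1.3), (1.4, 1.1),               # two incidents in another cell
    (2.6, 0.2),                           # an isolated incident
]
ranked = hot_spots(incidents)
```

Here `ranked` lists the two cells with the most incidents. A real analysis would also need to account for time, population at risk, and reporting bias, which are among the criticisms discussed in this chapter.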
3. Suggestions for Building an Effective Crime Prevention System
In conclusion, applying big data science to crime prevention is an inevitable trend of the present and the future, and the only viable course of action is therefore to take an open-minded approach that accepts the criticisms raised against these programs and resolves the related issues. In 2012, the South Korean Government announced the 「Big Data Master Plan for Smart Nation」, and one of its 16 agendas was ‘Minimizing Crime Occurrences by Predicting Crime Place and Time.’ To follow through on the plan, the Ministry of Public Administration and Security initiated a BPR/ISP project to lay the foundation for Big Data Policing. It appears, however, that these efforts have not yet been realized in the field. This study expects that an effective and efficient crime prevention system can be built in South Korea before long by drawing on the cases of foreign countries. One thing that cannot be overemphasized is that collaboration and a cyclic procedure should be at the core of any big data approach to successful crime prevention.
Chapter 4. Technology Changes and Direction of Future Development of Crime Prevention Systems Utilizing Big Data
Section 1. Development of Big Data Technologies
In this section, we describe the current status of and concerns about big data analysis, and then propose a new paradigm to address the problems.
To date, big data techniques have been regarded as a way to solve extremely difficult problems by analyzing massive data in ways that were not possible before. In practice, however, more attention has gone to building infrastructure for big data than to solving such problems. Likewise, issues arising from the volume of data have been discussed and studied more than the creation of new value.
Several methodologies from the fields of machine learning and data mining have been introduced to address the technical problems. One of the most famous approaches is ‘Deep Learning’, which models and infers hidden structures from massive data. Deep learning can provide more accurate and trustworthy results in classification and clustering than traditional approaches such as traditional neural networks. Another well-known approach is ‘Topic Modelling’, which can be used to automatically classify and categorize documents and content on the Internet. With these new methodologies, we need to consider the direction of future big data technology. From this point of view, we introduce two additional properties of big data: Value, for meaningful data, and Veteran, for the importance of experts. This gives 5V for big data: Velocity, Volume, Variety, Value, and Veteran.
After setting the direction of big data technology, we need to combine big data techniques with other new technologies. First of all, big data should be linked with cloud computing. However, cloud computing must be used very carefully because of information security. While cloud computing provides clients with massive storage and cheap management, confidential or critical data such as criminal records cannot be perfectly protected, because data in the cloud is essentially stored on public infrastructure. The Internet of Things (IoT) is another new technology to connect with big data analysis. Massive heterogeneous data from the various sensors in an IoT environment can be used to catch criminals and reduce crime. In addition, sensor data can be used to analyze the behavior of criminals and detect hidden crimes. The last technology to link with big data is the Cyber Physical System (CPS). While IoT focuses on the connection between small sensors and devices, CPS handles the safe control and management of the network and of the connections among devices.
In addition to connecting with other technologies, we need to improve the quality of the data and increase its volume at the same time. For instance, data quality can be improved by adding public data released by governments and openly available data. In general, agencies and officers use only the local database KICS, including the ‘Criminal Justice Service’ and the ‘Crime Police Information System’, but they might obtain more precise and important information about criminals by additionally analyzing and processing such public data.
Conventional big data analysis can become richer by linking it with ‘open source intelligence (OSINT)’, which is used to collect information and generate intelligence. That is, big data analysis becomes more powerful at creating meaningful value when it combines several sources: public data, open data, and OSINT techniques.
Section 2. Limits and Concerns of Crime Prevention Systems Using Big Data Analysis
Crime prevention systems based on big data still face several limits and concerns in achieving their actual goal.
First of all, there is a legal issue. The Korean police maintain several databases, including GeoPros, CIAS, and SCAS. However, they are separated and blocked from one another, so it is rather difficult to find criminals who can be detected only by linking multiple databases. This disconnection is due not to technical difficulty but to legal restrictions. It causes unwanted side effects as follows:
1) Inefficient investigation: It is difficult for agents to search for additional crimes committed by a particular criminal.
2) Redundant budget: Each database can hold redundant data about the same criminals, so budget is spent unnecessarily on duplicated data.
3) Poor quality and quantity of data: Keeping redundant, separated databases results in poor quality and quantity of data in each database, even though the combined data would be of high quality and large volume.
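The effect of linking separated databases can be sketched as follows. This is a minimal illustration only: the two record lists, the field names `suspect_id` and `case`, and the function name are hypothetical and do not reflect the actual schemas of the databases named above.

```python
from collections import defaultdict

# Hypothetical extracts from two separately managed databases.
case_records = [
    {"suspect_id": "S1", "case": "burglary-2014-03"},
    {"suspect_id": "S2", "case": "fraud-2014-07"},
]
location_records = [
    {"suspect_id": "S1", "case": "burglary-2014-09"},
    {"suspect_id": "S1", "case": "burglary-2014-03"},  # duplicate of a record above
]


def link_by_suspect(*sources):
    """Merge records from several sources on suspect_id, dropping duplicate cases."""
    linked = defaultdict(set)
    for source in sources:
        for row in source:
            linked[row["suspect_id"]].add(row["case"])
    return {sid: sorted(cases) for sid, cases in linked.items()}


linked = link_by_suspect(case_records, location_records)
```

In this sketch the second case attributed to suspect `S1` only becomes visible once both sources are combined, and the duplicated record is stored once rather than twice.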
The same issues can arise when combining OSINT techniques with big data analysis. While valuable counter-terrorism information can be generated by combining OSINT with local databases in the USA and UK, there have been few such attempts in Korea because of the legal issue.
Along with the legal issue, there are technical issues in using big data analysis to build a crime prevention system. The first technical problem is the uncertainty of big data. Even if highly sophisticated algorithms are adopted for the analysis, it is almost impossible to obtain meaningful results if the data is incorrect. Another technical concern is the difficulty of combining heterogeneous data from multiple channels, including IoT sensors, multimedia, and text. A further concern is that crime investigation is limited to using local data only, rather than sharing information from OSINT and public databases. In addition to these technical concerns, information security issues should be researched at the same time, because unwanted identification and privacy invasion can occur when hidden information is extracted from anomalies during big data analysis. This issue can become more critical if cloud computing services are used to reduce the national budget and obtain massive storage, since confidentiality can no longer be maintained in that situation.
A crime prevention system also raises social issues as well as technical ones, because crime prediction and prevention using big data analysis can be regarded as a large-scale monitoring system that keeps all citizens of a country under surveillance. This raises serious privacy concerns.
Section 3. Solutions to Address the Technical Problems in the Crime Prevention System
As noted in Section 2, cloud computing is expected to be used in big data systems in the near future in order to obtain massive storage and efficient management at low cost. However, it is critically important to maintain a high level of confidentiality and integrity for crime data even when it is moved to a public cloud system. This section therefore describes various approaches to providing confidentiality, integrity, and data privacy: privacy-preserving encryption, privacy-preserving data mining, access control, methods for defending against inference attacks, and a technique for detecting abnormal signals during data collection.
Section 4. Framework of a Crime Prevention System Using Big Data Analysis
The crime prevention system consists of four sequential steps, from the collection of physical information to the generation of valuable intelligence.
1. Collection step (Step 1): This step includes 112 reporting and officers' investigations.
2. Preprocessing step (Step 2): Data are filtered and pre-processed in order to obtain refined signals and data before applying big data analysis.
3. Analysis step (Step 3): Data analysis is mainly performed in this step. For instance, this step includes data search in conventional databases, data clustering and classification using data mining, and estimation and prediction.
4. Information extraction step (Step 4): After the data are analyzed, new valuable intelligence can be extracted from them in this step. This intelligence will be used to detect criminals and to reduce crimes.
Lastly, we propose a detailed framework comprising these four sequential steps. Each component of the framework is described and designed to support an effective crime investigation and prevention system.
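The four steps can be pictured as a simple chain of functions. The Python sketch below is only an illustration under stated assumptions: the `Report` fields, the sample data, and the threshold are invented, and plain incident counting stands in for the clustering, classification, and prediction that the analysis step would actually involve.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Report:
    source: str                # e.g. "112_call" or "officer_investigation"
    district: Optional[str]
    category: Optional[str]


def collect() -> List[Report]:
    """Step 1: gather raw reports (hard-coded, made-up sample data)."""
    return [
        Report("112_call", "North", "theft"),
        Report("112_call", "north ", "Theft"),
        Report("officer_investigation", "South", "assault"),
        Report("112_call", None, "theft"),  # incomplete record
    ]


def preprocess(reports: List[Report]) -> List[Report]:
    """Step 2: drop incomplete records and normalize text fields."""
    cleaned = []
    for r in reports:
        if r.district is None or r.category is None:
            continue
        cleaned.append(
            Report(r.source, r.district.strip().title(), r.category.strip().lower())
        )
    return cleaned


def analyze(reports: List[Report]) -> Counter:
    """Step 3: count incidents per (district, category) pair."""
    return Counter((r.district, r.category) for r in reports)


def extract_intelligence(counts: Counter, threshold: int = 2) -> List[Tuple[str, str]]:
    """Step 4: report the pairs with at least `threshold` incidents."""
    return sorted(key for key, n in counts.items() if n >= threshold)


hotspots = extract_intelligence(analyze(preprocess(collect())))
```

Keeping each step as a separate function mirrors the sequential structure of the framework, so that a single step (for example, the analysis method) could be replaced without touching the others.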
Chapter 5. Use of Big Data for Crime Prevention and Awareness Survey Thereon
Big data is a new type of tool which constitutes one of the main pillars of social transformation that bridges the present with the future. While it has the potential to offer various benefits across diverse areas in our society, disclosure of a wide range of information poses the risk of privacy invasion. The obscurity and unfamiliarity of the concept make it rather difficult to investigate public awareness of the use of big data and the scope of acceptable disclosure of personal information. However, such a survey is imperative now that big data has already become a reality. Therefore, this Study surveyed the general public and, through a questionnaire and interviews, experts on big data and crime prevention, in order to overcome the previous limitations and arrive at meaningful results.
1. Findings from General Survey
The survey of the general public found that 65% of respondents had at least some knowledge of what big data is, which is not a sufficient figure to conclude that awareness of big data is widespread in South Korea. Most of the people who know what big data is, however, had a somewhat broader understanding of the concept, defining it as 'all data used by individuals, business enterprises, and public organizations,' rather than narrowing the scope to individuals or business enterprises. In addition, most respondents agreed with the necessity to provide personal information for crime prevention purposes, and reported their willingness to make their personal information available for such purposes. The survey findings suggest other considerations for future crime prevention policies, however, in that a large number of respondents were reluctant to provide identifiable or sensitive information even for crime prevention, and the surrounding environment, police patrol, and CCTV were cited as key factors that create a feeling of safety in residential areas. In addition, the respondents cited a clearly defined legal basis as a precondition for provision of personal information, along with the right to be indemnified for damage suffered as a data subject and safe processing of information, as crucial factors.
Further, when asked about possible information privacy invasion caused by provision/collection of personal information for big data, the respondents were found to be more sensitive about the possibility of their personal information being used, accessed, and transferred without authorization for other purposes, the importance of knowing how such information is used, and whether the accuracy of such information can be maintained, than about government organizations' collection of personal information or how it is used. The concern for information privacy was negatively correlated with the respondents' confidence in the government as the collector of personal information. It must be noted, however, that the respondents were keenly aware of the need for big data for crime prevention purposes despite their concern for information privacy.
73% of the respondents expressed their willingness to provide their personal information for crime prevention based on big data, and confidence in the government was found to positively affect such willingness. This shows that expanding the use of big data for crime prevention requires building people's confidence in the government. According to the findings, however, people expressed low confidence in the government (2.77 points), and an even lower level of confidence in the current level of big data experts, technologies, and security. Also required are technical measures and security systems to prevent information leakage, and procedures to ensure accountability for such incidents. These findings show the importance of securing the human resources and technical foundation before utilizing big data.
The construction of a crime prevention system utilizing big data is being positively perceived as an alternative to reduce crimes in our society. It is expected to be particularly effective in terms of reinforcing policing services, identifying dangerous areas, and allocating patrol workforces. Such expectation, however, is accompanied by concerns over serious threats against personal freedom and rights in the form of the possibility of privacy invasion caused by the collection and sharing of personal information by government organizations, and the possibility of resulting in a new national surveillance system or power. Such concern is widely shared by people in their 20's or 30's, people in the IT sector, and people with higher education backgrounds. In this sense, efforts for reasonable and meaningful resolution of such concerns over privacy invasion need to be made before using big data.
On another note, more than half of South Korean citizens were found to know about the personal information protection law, and 86.6% of them were aware of the significance of protecting personal information. The highest percentage of respondents, however, cited 'possibility of personal information being used for criminal purposes' as the reason why personal information should be protected. This means that the government should relieve the people of the concern over possible injury caused by misuse of personal information, and assure the people of South Korea that it uses only the necessary information in ways suited to the original purpose. South Korean citizens, regarding their rights as data subjects, put emphasis on consent to use of personal information, and the right to select and determine the scope of their consent. The more they were predisposed to disclosing their personal information, the more emphasis they put on their right to know and the right to control their own information. Entities using personal information, it seems, should establish remedy procedures for data subjects affected by leakage of personal information. This is consistent with the findings from the survey on people's concern for online privacy, in which the respondents gave the highest score to the concern over damage caused by personal information.
To summarize, the general survey showed that while the general public is highly aware of the necessity of preventing crimes using big data and willing to offer personal information for such purposes, such efforts should be preceded by preparing the technologies and systems required, establishing the procedures to protect the rights of data subjects and indemnify them for possible damage, and acquiring citizens' consent. Another element required for wider use of big data for crime prevention is the citizens' willingness to offer their information, that is, their contribution to big data. Especially since people's confidence in the government was found to affect their willingness to contribute, government organizations will have to gain people's trust in terms of the related systems, security technologies, expertise, and protection of the public interest.
2. Findings from Expert Survey
The experts surveyed in this Study were those working in the field of crime prevention whose work is related to big data or who conduct research using big data. Unlike the general survey, the expert survey included questions on how big data is related to their work, the level of big data infrastructure of the organizations that they are affiliated with, the effectiveness and influence of big data, as well as the desired level of data exchange and disclosure, allocation of responsibility in case of leakage of shared data, and the infrastructure required to expand the use of big data. The survey showed that the responding experts had more knowledge about big data, with respondents working in research labs, or respondents majoring or working in the IT sector, showing an even higher level of awareness. They were also more aware of the necessity and relevance of big data in their professional duties. The respondents, however, gave only an average-level score with regard to the big data management capabilities of their affiliations, and many of them pointed out the lack of separate systems for big data management, suggesting the need to take the required policy actions. The respondents did not hold the government organizations' big data management capabilities in high regard, and expressed a high awareness of the need for human resources training, reorganization of organizational culture and work processes, and introduction of new legal systems as the infrastructure required for the use of big data. This tendency manifested most prominently among respondents working in police agencies and research institutes.
During the in-depth interviews, the experts opined that real-time analysis for crime prevention requires an organization dedicated to big data analysis. The majority of respondents answered that the appropriate form of such an organization is 'a department within a government agency,' which needs to be heeded when determining how such an organization is set up.
While inter-organization exchange and disclosure of data lie at the core of building and using big data, the interviewed experts were of the opinion that the level of data disclosure between South Korean public organizations is low, and the laws and policies for data sharing and usage are lacking. The interviewees argued that, to ensure meaningful use of big data, the data held by public organizations need to be disclosed in full after deidentification and, in case of a leakage, the organization which leaked the relevant information should be held accountable.
Big data is also expected to be required by many areas of South Korean society, and to exert significant social and economic influence. The experts were particularly aware of the necessity of big data for crime prevention. As for the public sector, the experts expressed the opinion that big data will be most effective when used to support policy-making efforts by government organizations. Despite such expectations, the findings suggest that South Korea has yet to create a sufficient big data-based environment, and is still lacking in terms of the relevant legal systems, trust between organizations, and technologies for privacy protection. Such an environment is related to the infrastructure for expanding the use of big data: sufficient preparations need to be carried out for future construction of big data systems.
Regarding the procedures for collecting and processing personal information, the relevant rights, and the methods for acquiring consent, the experts stressed the procedural aspects of the issue, whereas the general public expressed a more fundamental demand for a clearer legal basis and emphasized the right to be forgotten, privacy invasion, and the remedies available for the resulting damages.
The analysis of the respondents' concern over privacy invasion, one of the most significant issues related to big data, showed that the general public was more concerned about such a possibility than the experts. Experts were more accepting of how government organizations use personal information. Such a gap between the two groups can also be found in their confidence in the government. Statistically significant differences were found between the experts and the general public in their confidence in the government, in the level of the government's personal information security, and in the crime prevention organizations. The gap was even wider when it came to protection of the public interest. Narrowing such a gap between the perception of the experts and that of the general public seems to be a pivotal variable when implementing big data and personal information protection policies.
While both the general public and experts were keenly aware of the need for big data for crime prevention purposes, a gap between the perceptions of the two groups can be found across various areas. The experts had high expectations for the use of big data in terms of reduction of crime occurrences, enhancement of policing activities, arrests, efficient allocation of resources, reasonableness, and recommendation of scope of crime prevention activities, and they were also more tolerant of possible issues related to the use of big data (privacy invasion, strengthening of national monitoring systems or state power, victimization of innocent bystanders, etc.). The experts also set far wider boundaries of information to be included in big data compared with the general public. Assuming the use of big data, the general public put higher importance on 'consent of the people' while the experts emphasized the importance of 'due process.' While experts cited 'measures to prevent spread of information leakage' as the most important measure to be taken in case of information leakage, the general public expressed a markedly different point of emphasis by citing 'financial compensation for victims.' However, as for the obstacles to building a crime prevention system using big data, the majorities of both groups pointed out the 'distrust towards the government organizations in charge of crime prevention,' suggesting the need to improve confidence in the government to form the foundation for expanding the use of big data.
The last section of the questionnaires was designed to find out the respondents' perception of the importance of protecting personal information, and both the general public and the experts were found to hold that protecting personal information is highly important. Differences between the two groups were identified across various aspects of the provision of personal information, however. The general public was concerned about personal information mainly because of the possibility of the information being used for criminal purposes, while the experts were more concerned about possible invasion of the personality, reputation, and privacy of the data subjects. As for the practical necessity of personal information in the field, the experts more readily agreed with such necessity compared with the general public. In the measurement of the respondents' concern over privacy invasion in on-line spaces, including public on-line spaces, the widest gap between the two groups was found in terms of the possibility of on-line ID theft, use of leaked on-line information for criminal purposes, and concern over the possibility of financial damages. The latter two elements constitute the concern over possible loss, where the general public expressed more concern over possible damage caused by on-line information leakage than the experts. On a related note, the 'reinforcement of technical protective measures' was the most highly demanded measure to prevent leakage of personal information. In addition, for users of personal information, notifying data subjects of possible countermeasures and available remedies was cited as the most important action to take in case of a leakage. Therefore, policy-makers should take more account of the possible damage caused by information leakage, along with the goal of expanding the use of big data, in their efforts to create policies protecting personal information.
Chapter 6. Suggestions for Construction of Crime Prevention System Utilizing Big Data
1. Suggestions for Legislation
Legislative efforts to lay the legal foundation for a big data-based crime prevention system should be preceded by clearly articulating the logical grounds for the necessity of such legislation. In other words, such legislation needs to be justified in terms of the balance between its motivation and its purpose, the proportionality of the means to the purpose of legislation, and the inevitability of regulation. In addition, legislation involving the use of big data must contain clear provisions defining the interests to be protected by the efforts to prevent privacy invasion, and be drafted in such a way as to maintain proportionality and balance between the related interests. Along with such review of the necessity of legislation, consideration must also be given to the effects expected therefrom, the scope of its influence, and how it is received by the general public who will be affected by it.
It cannot be denied that the current legal system does not provide sufficient justification for building a crime prevention system based on big data, because there is presently no statute in place to provide the legal basis for collection, processing, and use of personal information for crime prevention purposes.
In light of the above discussion, this Study proposes the following legislative guidelines for building a crime prevention system based on big data. Firstly, such legislation needs to provide for basic principles regarding the use of personal information for crime prevention purposes, such as: statutory reservation, legality, restriction on purpose of use, accuracy, and up-to-dateness. Secondly, the legislation should clearly establish the criteria for granting exemptions to the aforementioned principles for crime prevention purposes, and the criteria for handling personal information. To be specific, a crime prevention system based on big data requires the use and handling of personal information, which may contain a substantial amount of sensitive information. Therefore, the criteria for granting exemptions and handling/using such information constitute a core requirement for the relevant legislation, in that such exemptions and use may cause serious infringement on personal information. Thirdly, the legislation needs to adopt provisions on technical and procedural obligations for personal information protection, such as: anonymization of personal information, assessment of crime prevention systems in terms of their impact on personal information, mandatory registration of systems used for handling personal information, and definition of the duties and responsibilities of the managers and staff of information processing systems.
2. Suggestions for Policy-making
Policy suggestions for a crime prevention system based on big data can be given from two perspectives: the technical perspective, and the policy perspective.
From the technical perspective, one of the prerequisites for building a crime prevention system based on big data is the construction of an organic information interface within criminal justice organizations. Government organizations have constructed their own database systems for their respective purposes, which has led to the inability to share or synchronize information between those databases or systems. Therefore, for a crime prevention system to use personal information collected by various departments or organizations, the relevant players need to reach a fundamental agreement regarding how to provide, analyze, process, and use the information stored in each database system. It should be noted, however, that building an integrated system by transferring all information from existing systems into the new crime prevention system is not the best course of action. It would be more desirable to find a way to establish organic links between the database systems and use those links to carry out the required analysis and processing work. To share and link the information stored across various personal information processing systems and use such information for crime prevention based on big data, the processing systems and the data therein must be standardized first. These efforts should be accompanied by other efforts to prevent privacy invasion by the crime prevention system, such as designing and implementing the applicable technologies and the overall operation processes in such a way as to offer appropriate protection for personal information, and strictly adhering to such processes across all stages from collection to destruction.
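One way to picture the linking approach described above is a thin adapter per organization that maps its local field names onto a shared, standardized schema at query time, so that records stay in their original systems instead of being copied into a single integrated store. The Python sketch below is purely illustrative: the two in-memory "databases", their field names, and the shared schema (`case_id`, `district`, `category`) are assumptions made for the example and do not describe any actual system.

```python
from typing import Callable, Dict, Iterator, List

Record = Dict[str, str]

# Two hypothetical source systems with different local schemas (made-up data).
POLICE_DB: List[Record] = [
    {"case_no": "P-1", "area_name": "Central", "offence": "Theft"},
    {"case_no": "P-2", "area_name": "East", "offence": "Fraud"},
]
PROSECUTION_DB: List[Record] = [
    {"id": "S-9", "region": " central", "charge": "theft"},
]


def make_adapter(rows: List[Record], mapping: Dict[str, str]) -> Callable[[], Iterator[Record]]:
    """Return a reader that yields one source's rows in the shared schema.

    `mapping` maps each standardized field name to the local field name.
    Values are trimmed and lower-cased so that they compare consistently.
    """
    def read() -> Iterator[Record]:
        for row in rows:
            yield {std: row[local].strip().lower() for std, local in mapping.items()}
    return read


SOURCES: Dict[str, Callable[[], Iterator[Record]]] = {
    "police": make_adapter(
        POLICE_DB, {"case_id": "case_no", "district": "area_name", "category": "offence"}
    ),
    "prosecution": make_adapter(
        PROSECUTION_DB, {"case_id": "id", "district": "region", "category": "charge"}
    ),
}


def federated_query(sources, predicate: Callable[[Record], bool]) -> List[Record]:
    """Run one predicate against every source and tag each hit with its origin."""
    results = []
    for name, read in sources.items():
        for record in read():
            if predicate(record):
                results.append({**record, "source": name})
    return results


matches = federated_query(
    SOURCES, lambda r: r["district"] == "central" and r["category"] == "theft"
)
```

The design choice here is that standardization lives in the mapping tables rather than in the stored data, which reflects the point that agreement on a common format has to come before any linking can work.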
As for the policy perspective, even the most elaborate legal, technical, and procedural processes for protection of personal information would not be able to prevent risks of civil and criminal liability for violation of the relevant laws, or damages affecting the entire society, if such processes are not respected and complied with in the course of carrying out daily duties. For this reason, an information protection compliance system needs to be established so as to monitor whether work related to protection of personal information is being properly carried out in accordance with statutory regulations or internal/external guidelines, ensure compliance with such regulations and guidelines, and continue to address and improve on the identified issues.
The most significant policy consideration when building a crime prevention system based on big data would be how to overcome the people's concern about the possibility of surveillance and control by state organizations. Allowing the state to use any information to prevent crime would lead to serious ethical and legal issues, as such use of information would be tantamount to establishing a surveillance state or a police state which considers all citizens as potential criminals and preemptively restricts the freedom and rights of the people for crime prevention purposes. Therefore, when building a crime prevention system based on big data, rather than focusing on achieving high effectiveness by broadening the scope of information to be utilized, a phased approach is worth considering. Such an approach limits the scope of information in the initial phase to the information collected by criminal justice organizations or investigation organizations, gradually widens the scope to include other public organizations while gaining the people's trust by proving the effectiveness of the crime prevention system, and then incorporates the information collected from the private sector.