
From the Data Rush to the Data Wars: A Data Revolution in Financial Markets

Atlanta, GA

Sept. 27, 2018

Georgia State University College of Law – Henry J. Miller Distinguished Lecture Series

Thank you Dean [Wendy] Hensel for that kind introduction. It is an honor to be chosen to give the 62nd Henry J. Miller Distinguished Lecture at Georgia State University College of Law. It is a testament to Henry Miller himself that his legacy continues to advance a public dialogue on so many important topics and I’m pleased to be part of that discussion. I would also like to thank a few of my colleagues for being here tonight—Bill [Dixon], Aaron [Lipson], and Richard [Best], who work in the Securities and Exchange Commission’s Atlanta Regional Office.

This evening, I will speak to you about data—a topic as germane to the industry I help oversee—the financial industry—as it is to other professions, such as medicine, science, and engineering. Before I jump into the deep end of the data pool, however, let me pause to say that I am speaking today as an individual Commissioner and not on behalf of the U.S. Securities and Exchange Commission.

Our financial markets need two ingredients to function properly: trust and information. When either is constricted, the financial markets can seize up. We have seen this many times before, from the Great Depression almost a century ago to the Great Recession a decade ago. Scarce and unreliable information was one of the major problems that led to both of these major market disruptions. When investors did not know what they were actually invested in, or that rampant conflicts of interest existed, trust dried up.[1]

In response to the 1929 stock market crash and subsequent market turmoil, Congress passed the “Truth in Securities Act”—also known as the Securities Act of 1933.[2] This landmark legislation required companies to disclose certain types of information when investment contracts or securities were being sold. It also prohibited fraud and misrepresentation in the presentation of that information. In effect, it improved the quality of information, or data, in the market—and it fostered trust.[3] The Securities Exchange Act of 1934 followed. It specifically stated that the new law was necessary to “insure the maintenance of fair and honest markets . . . .”[4] The legislation that followed the Great Recession was the Dodd-Frank Wall Street Reform and Consumer Protection Act. Like its predecessors in the 1930s, this seminal legislation shined light where there once was darkness by increasing transparency in the derivatives and other markets.

The increased transparency that resulted from these pieces of legislation was one of the great transformations in the securities markets. It allowed investors to trust the markets. This trust, in turn, allowed financial services companies to thrive and seek new and novel ways to remain competitive. Countries around the world have used our regulatory system as a model for the regulation of their own financial markets.

Fast forward to today, and the prolific availability of data and information has disrupted and transformed the capital markets. Financial services companies look nothing like they did in the ’20s and ’30s. Stock exchanges, securities brokers and dealers, investment advisers, and other key participants in the securities markets now look and act more like technology companies. In fact, today’s investors may only interact with a software program or smartphone app when making investment decisions or executing transactions. As financial services companies have transformed into FinTech companies, technology companies are also beginning to enter the financial services space. Some call them “TechFins,” because they were technology companies first.[5]

These changes have only accelerated in the last decade. Financial services companies are harnessing the data that exists, from client preferences to idiosyncrasies in market trends, in order to continue to grow. Many of these data points have existed for decades. But companies, governments, and even individuals have radically enhanced their ability to extract, use, and manipulate data in new and increasingly value-added ways. In other words, financial services companies and others are mining and capitalizing on both their own data and the data of others.

It makes sense that these transformative changes are provoking new and complicated questions about data ownership, use, availability, and protection. In order to oversee the financial markets with insight and intelligence, the Commission I am a member of, the U.S. Securities and Exchange Commission, needs to start grappling with some of the potential answers to these questions. We need to be able to adapt with our own RegTech.

Data Rush

Thousands are seeking to make their fortunes by exploring new ways to collect, parse, and analyze data. Similar to the Gold Rush in the 19th Century, we are in the midst of a Data Rush. Start-ups and established players are all searching for future fortunes with a persistence seen only a few times in history. And like the American Westward Expansion, in many ways there have been no rules, no boundaries, and seemingly unlimited possibilities. But, establishments in new areas have begun to take hold, and both people and companies have started to ask for a little more law and order.

With tens of billions of devices being connected together[6] and a growing Internet of Things (IoT), data is ubiquitous, and is the foundation of many of the Internet-based tools we use. It enables research in virtually all fields of study. The technology to collect data is available to many and is virtually boundless. In some ways, it is like the Force in the movie Star Wars: “It surrounds us and penetrates us. It binds the galaxy together.”[7]

Machine learning is being deployed to sift through patient medical records to identify adverse health clusters.[8] Doctors are using facial recognition algorithms to diagnose rare genetic conditions.[9] Law firms are using machine learning and algorithms to recognize prior litigation patterns.[10] The accumulation of DNA in databases is now being used to solve crimes faster and more accurately than at any point in history.[11] And investment firms are using robo-analysts to conduct “more comprehensive analysis . . . than ever before” to help recommend investment options for consumers.[12]

On a more personal level, many of us are mining data every day without thinking much about it. We use an app to determine the number of calories, the sugar content, or the ingredients in a particular food. We own wearable technology that tracks our heart rate or the quality of our sleep at night. And we compete against others to get in more steps than a colleague. And we aren’t alone. Just as you focus on generating, collecting, and analyzing those steps, FinTech, TechFin, and RegTech firms are competing on their ability to generate, collect, analyze, and act on data.

Our growing dependence upon data and the tools that analyze and manipulate it is having a profound impact on humanity. The “Data Rush” is quickly, but silently, revolutionizing our nation and our world, just as the Gold Rush transformed our country. It is disrupting and transforming all that it touches, including our financial markets, financial firms, and investors.

While we focus on what data allows us to do—whether it is a new game or health app—battles are being fought over who owns the data, who should have access to it, how it can be used, and whether and how it should be protected. In effect, data has been commoditized. And like the gold rush of old, people are fighting over the most valuable data sets. That is because in data, there is meaning. For example, from securities transaction data, analysts can extract trends, patterns, or correlations. The ability to better forecast the future is, of course, at the core of every financial professional’s dream.[13] Because of its predictive abilities, data has tremendous value, which can ultimately translate into tremendous profits.

FinTech firms and established market participants—such as the securities exchanges, dark pools, and internalizers, as well as broker-dealers, investment advisers, and others—know that data is extremely valuable. Not only are these entities users of the data, but they collect troves of it. Data is packaged and sold to participants across the financial markets. How fast the data is delivered, how comprehensive it is, and how it can be used to advantage market participants are factors that translate into its price. To give you an idea of the scale of interest in data and technology, one large investment bank dedicated 50,000 people and $10.8 billion just this year to technology.[14]

During the Gold Rush, one could witness the pioneers and prospectors traveling by foot, horse, or wagon train in droves to the western half of the continent. It is harder to visualize the pace and scale of the digital transformation occurring now. The billions of lines of source code and the untold number of networks that carry data are largely hidden from view. But it’s there. Notably, 90% of the world’s data has been created in the last two years alone.[15] And this data is traveling faster and more covertly every day. Data, including stock market data, that was previously transmitted through paper, fax, and dial-up Internet connection is now being transferred via fiber optic cables, microwaves, and low-orbit satellites.

This race to collect and control data is intensifying. Many firms and individuals are rushing onto the course but few are thinking about what the rules of the race should be. Most are focused on the potential benefits and not on the potential costs or unintended consequences. Just as driverless or autonomous cars are guided by computer code instead of humans, so are a majority of our securities transactions. And we have not changed our regulatory paradigm. How does this affect the Commission’s mission to protect investors?

This race for data superiority may be creating two classes: those who can pay for data and those who can’t. My point is that our financial system’s growing dependence on vast amounts of data and the tools that analyze it is significantly changing our financial markets. Our financial regulations need to change as well. As a result of this new Digital Age, we need a different kind of regulation for the future: smart, agile, and intelligent regulation that focuses on those that threaten our markets, while at the same time supporting the innovation that drives economic growth.

Underlying many of the battles in this data race are arguments that demonize policymaking of any type. But, as a society, we need to start answering some of the following questions:

  • Should a company value its data?
  • Should it disclose the value of its data?
  • Who is responsible for the appropriate collection and use of data?
  • Who is responsible for protecting the privacy of personally identifiable information that is collected and used?
  • Who is responsible for determining how data can be shared?
  • Who is responsible for establishing and implementing minimum standards for data collection and use?
  • Who is responsible for addressing inherent conflicts of interest?

This responsibility can reside in a number of places. But if everyone is responsible, then no one is responsible. So, we need to start answering those questions. If we fail to implement changes that make the system fair and transparent, trust in our markets may erode and uncertainty may take hold, just like it did in 1929 and in 2008.

Laying the Groundwork for the Data Infrastructure of Tomorrow

The prospector towns established during the Gold Rush eventually needed laws and rules. The same is true here. We need to establish guideposts so that individuals and companies can better and more cost-efficiently explore the new frontier, while being mindful of the collateral impacts on our entire community.

I am hopeful that the Commission can help resolve some of these issues by being more creative and forward-leaning than it has ever been. We must be able to oversee a marketplace where technology companies and machines exchange large amounts of data at the speed of light.

With today’s trading occurring in microseconds in a market dominated by computerized and automated trading, we cannot rely solely on the human eye to detect problems. Small mistakes can become big problems very quickly. Accordingly, the Commission needs to prioritize the completion of the Consolidated Audit Trail, or CAT. When complete, the CAT will receive approximately 58 billion records per day, making it the world’s largest data repository of information on securities transactions. The CAT will allow orders to be tracked throughout their life cycle, from order entry to trade execution. It will also identify brokers involved in the orders and help reveal misconduct. The CAT will have the ability to transform market surveillance and our understanding of the market, much as the Hubble Space Telescope has transformed our view of the universe.

While progress on the CAT is being made, I believe the staff of the Commission should also begin to think strategically about the future collection and use of data. We need to have a data vision and strategy if we want to be an effective regulator in the 21st Century.[16]

We also need to be good stewards of data and live up to the best standards in the world for protecting it. Recent advances in cryptography now allow data users to share information while, at the same time, preserving its confidentiality.[17] For example, secure multi-party computation (SMC) can help protect the privacy of the firm or individual submitting its data to the Commission. SMC allows sensitive data to be collected in an encrypted manner. The data can then be analyzed without ever revealing the firm or individual from which it originated.

The Commission could also use SMC technology to evaluate financial metrics without revealing the individual contributors. This methodology could help monitor and identify potential problems, such as concentrations or exposures indicative of systemic risk, increases in aggregate leverage, or reductions in market liquidity. By deploying this technology, the Commission could more easily identify threats to the financial markets by scoring concentration ratios or crowded trades. This can be done, all while protecting the anonymity of the firm or the proprietary nature of the data.
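One way to picture how an aggregate can be computed without exposing any single contributor is additive secret sharing, a basic building block of SMC. The sketch below is illustrative only: it simulates every party in one process, and the names `split_into_shares`, `secure_sum`, the modulus, and the exposure figures are assumptions made for the example, not a description of any system the Commission uses.

```python
import secrets

# Public modulus (an assumption for this sketch); all share arithmetic is
# done modulo this value, and the true total must be smaller than it.
MODULUS = 2**61 - 1


def split_into_shares(value, n_parties):
    """Split a non-negative integer into n additive shares.

    Each share on its own looks like a random number; only the sum of all
    shares (mod MODULUS) equals the original value.
    """
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last_share = (value - sum(shares)) % MODULUS
    return shares + [last_share]


def secure_sum(private_values):
    """Compute the total of the private values using only their shares."""
    n = len(private_values)
    # Each firm splits its own value and hands one share to every party.
    all_shares = [split_into_shares(v, n) for v in private_values]
    # Party j adds up the j-th share it received from each firm.
    partial_sums = [
        sum(firm_shares[j] for firm_shares in all_shares) % MODULUS
        for j in range(n)
    ]
    # Only the partial sums are combined, which yields the aggregate.
    return sum(partial_sums) % MODULUS


# Hypothetical per-firm exposures, used purely for illustration.
exposures = [120, 45, 300]
total_exposure = secure_sum(exposures)
```

In this toy version the aggregate exposure is recovered while no party ever handles another firm's raw figure; a real deployment would also need secure channels between separate parties and protection against participants who deviate from the protocol.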

Risk-based analysis and oversight is a proven way to detect fraud, prevent wrongdoing and, ultimately, protect investors. For example, models can be constructed using data from clickstreams, which are essentially obtained from a series of computer mouse clicks. These models can then be used to help identify targets for scams and predatory practices. In addition, examining social media sites may allow the Commission to intervene earlier in Ponzi schemes. The Commission’s Office of Compliance Inspections and Examinations and several of the other Divisions have been using data-driven, risk-based examinations and investigations for some time now, but we have only scratched the surface.

The Commission has taken small steps to collect better data in other contexts. For example, in 2016 we adopted reporting requirements for mutual funds and exchange-traded funds that require the use of XML, which is a form of structured data.[18] Structured data is human-readable information that has been converted to a format that a computer can more easily read and understand.[19] Likewise, our Form PF, on which hedge fund managers report, also uses XML. Just this past summer, the Commission adopted rules requiring operating companies, mutual funds, and exchange-traded funds to report certain financial, expense, risk, and performance information in a better form of structured data. Slowly but surely, the Commission is transitioning its reporting requirements towards a more data-centric model.

Moving towards a world of structured data benefits all users, which is why it is difficult to find reasons why information submitted to the Commission should not be in a machine-readable format. For instance, investors, with the help of data aggregation platforms and tools, can ultimately use this information to compare investments, much as many of us do when we shop online. Academia, data aggregators, and governments can use this type of data to, for example, conduct research more quickly and to answer difficult and complex questions about systematic risk at the aggregated level with much less effort. And companies, funds, and other issuers can use the data to run data analytics across their peers, analyze their own risk profiles, and develop investments that better align with investors’ goals.
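To make the comparison-shopping idea concrete, here is a small sketch of how a program might read fund data reported in XML and rank funds by cost. The element and attribute names (`funds`, `fund`, `name`, `expenseRatio`) and the figures are invented for illustration; they are not the schema of any actual Commission form.

```python
import xml.etree.ElementTree as ET

# Hypothetical structured filing data; the tag names are assumptions.
SAMPLE_XML = """
<funds>
  <fund name="Fund A"><expenseRatio>0.45</expenseRatio></fund>
  <fund name="Fund B"><expenseRatio>0.12</expenseRatio></fund>
  <fund name="Fund C"><expenseRatio>0.80</expenseRatio></fund>
</funds>
"""


def rank_by_expense(xml_text):
    """Return (fund name, expense ratio) pairs sorted from cheapest to costliest."""
    root = ET.fromstring(xml_text.strip())
    rows = [
        (fund.get("name"), float(fund.findtext("expenseRatio")))
        for fund in root.iter("fund")
    ]
    return sorted(rows, key=lambda row: row[1])


ranking = rank_by_expense(SAMPLE_XML)
```

Because every value sits in a labeled field, the same few lines would work across thousands of filings, which is much harder when the numbers are buried in free-form text.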

While data and data analytics can help the Commission prevent fraud, protect investors, and inform decision-making, they also help inform intelligent and effective policy-making. I strongly believe in using facts to design good public policy. As such, analyzing and understanding data is critical to shaping future regulation. As the primary regulator of the financial markets, the Commission should employ a multi-disciplinary approach to analyzing and understanding market data. By deploying data scientists to work alongside the lawyers and accountants of the Commission, I believe the Commission will embark on a new model for enhancing both the efficiency and the effectiveness of its oversight.

Cyber Wars

If you will allow me to switch gears a bit, I would like to talk about what happens when data and computer systems housing that data are not protected. Attacks in the cyber domain are one of today’s most prevalent forms of technological warfare,[20] and I’d like to paint the picture of the battleground’s front lines.

Imagine that you have your morning latte and are ready to start your work day. You sit down at your computer, but, instead of your normal login screen, you see a black screen and the statement, “repairing file system on C:”. Nonplussed, you restart your computer, which seems to have no effect. When you call your company’s IT help desk to diagnose the issue, you learn that this problem is affecting nearly every person in all of the company’s more than 500 offices in 130 countries around the globe. Your company is at a standstill.

This happened last year in a malware attack that cascaded from network to network, and from company to company. It disrupted the global supply chain, led to the shutdown of American shipping ports, and affected the United States’s supply of vaccines. The White House estimated the incident caused over $10 billion in damages.[21]

World-wide connectivity, growing software and security complexity, and expanding attack surfaces mean that cyber threats are ever-present and increasingly costly. The World Economic Forum estimates that the cost of cybercrime to businesses will climb over the next five years to $8 trillion per year.[22] Just to put that number in perspective, that’s about $24,000 for each person living in the United States. And that estimate doesn’t account for the total economic loss, such as lost revenues, or lost trust. Investors and the Commission aren’t just concerned with the dollar value of the damage or immediate remedial efforts. They are concerned about how cyber threats affect the long-term future of the companies. Was valuable intellectual property taken? Will customers continue to trust the company with their personal information? Was sensitive data exposed? And what happens if a large number of investors, all at once, decide not to trust a company, or the entire financial system?
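The per-person figure can be checked with simple division. The sketch below assumes a U.S. population of roughly 327 million, which is an approximation supplied for this example rather than a number taken from the cited estimate.

```python
# Projected annual cost of cybercrime to businesses, from the estimate above.
annual_cost_usd = 8_000_000_000_000

# Assumed U.S. population (approximate 2018 figure, not from the source).
us_population = 327_000_000

# Cost expressed per U.S. resident.
cost_per_person = annual_cost_usd / us_population
print(f"About ${cost_per_person:,.0f} per person")
```

Under that assumption the result lands between $24,000 and $25,000 per person, consistent with the rounded figure quoted above.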

One of the most important investor protection initiatives remains protecting the privacy of consumers and their information. Protecting sensitive data—such as names, addresses, account numbers, and Social Security numbers—is of paramount concern. The press highlighted this concern recently in a report describing computer vulnerabilities on certain online investment platforms. Some of the most prevalent vulnerabilities demonstrate a lack of attention to basic protections, such as a failure to encrypt balances, portfolios, and personal information.[23]

We also should modernize our regulatory approach regarding the overall collection, protection, use, and sharing of personal investor data. Much has changed since 2000 when the Commission first issued privacy rules.[24] These rules merely require “firms to adopt written policies and procedures to protect customer information against cyberattacks…” They do not, however, require firms to actually protect customer information or to notify investors when their information has been compromised.[25] Are investors fully aware of how and where the data is being used? Do investors know who has their data and for what purpose they have it? The Federal Trade Commission has begun a series of hearings on Consumer Protection in the 21st Century. The Securities and Exchange Commission can, and should, be equally proactive in its oversight of broker-dealers, investment advisers, and other market participants so that investor data is appropriately protected.

Ensuring that our markets have operational readiness to withstand large-scale disruptions is also critical. In November 2014, the Commission issued Regulation SCI, which stands for systems compliance and integrity. Reg SCI required certain critical market participants, such as exchanges and clearing platforms, to establish written policies and procedures reasonably designed to ensure that their computer systems can maintain their operational capability in the event of a disruption. For example, Superstorm Sandy, which battered New York City and other locations in 2012, caused a major disruption in the financial markets, including the closure of U.S. national securities exchanges for two days.[26] The 2014 rule was intended to prevent this from happening again, but it didn’t go far enough. The rules left out many participants. I would like to see this rule expanded to cover other key market players that possess investor information, such as broker-dealers, investment advisers, and transfer agents. Accordingly, I have asked Chairman Clayton to prioritize what I call “Regulation SCI 2.0.”

For years we have encouraged “written policies and procedures,” “voluntary” frameworks, and “codes of conduct” to deal with cyber threats. The Commission has offered non-binding “guidance” and advice to market participants.[27] While advice can be helpful, both government and businesses are in a new world. We need to think more comprehensively about the cyber wars going on. All need to up their game to protect our critical systems, personal data, and economy from cyber threats. Tepid responses from government and businesses are invitations that cybercriminals simply cannot ignore.

Indifference about cybersecurity is simply not an option.[28] Boards have a fiduciary duty to shareholders. Shareholders and policymakers expect boards of directors to oversee and to evaluate corporate risk-taking. Board members need to proactively take action on the oversight of cybersecurity as a critical component of a company’s risk management.

I am not saying that the board must manage the day-to-day risk of cyber threats. However, boards must take charge of the oversight of cyber risks. In particular, boards must consider whether their members have the appropriate digital acumen to carry out this important responsibility. For example, Commission rules require public companies to disclose whether boards of directors have at least one financial expert on their audit committees. Likewise, boards should consider whether they have an independent member with expert knowledge of technology and cybersecurity. If not, boards should retain independent experts to provide them with advice. Furthermore, independent directors should meet with the company’s Chief Information Security Officer at least twice annually in executive session, without members of management present, so that they can have open, frank, and meaningful discussions about culture, tone, and the resources dedicated to both prevention and resiliency.

Additionally, boards must assess whether disclosures to shareholders adequately and faithfully represent the significant cyber risks that may impact investment decisions.

With the increasing likelihood that breaches and attacks will happen, boards should be particularly focused on the company’s resiliency. How will the company respond? How resilient is the infrastructure? What are the procedures for recovery and resumption?

At the Commission, we talk a lot about trust in our financial institutions, in the companies that register with us, and in the professionals that recommend investment transactions, like brokers and investment advisers. And for good reason: the market only functions when that trust exists. And the Commission must help protect the sanctity of that trust.


In closing, I would note that these issues don’t merely represent existential concerns; rather, they affect you, your money, and your identity. Thinking about whether and how to regulate both data and the market participants who use it is critical to ensuring a sound and stable financial system. It is critical to attracting investors and businesses from around the world. Without it, trust can erode as it did almost a century ago and again a decade ago.

Thank you for your time, and for inviting me to speak with you this evening.

[1] “[T]here must be an end to a conduct in banking and in business which too often has given to a sacred trust the likeness of callous and selfish wrongdoing. Small wonder that confidence languishes, for it thrives only on honesty, on honor, on the sacredness of obligations, on faithful protection, on unselfish performance; without them it cannot live.” Franklin Roosevelt’s Inaugural Address, March 4, 1933, available at “By developing expertise in gathering relevant information, as well as by maintaining ongoing relationships with customers, banks and similar intermediaries develop ‘informational capital.’ The widespread banking panics of the 1930s caused many banks to shut their doors; facing the risk of runs by depositors, even those who remained open were forced to constrain lending to keep their balance sheets as liquid as possible. Banks were thus prevented from making use of their informational capital in normal lending activities. The resulting reduction in the availability of bank credit inhibited consumer spending and capital investment, worsening the contraction.” Chairman Ben S. Bernanke, The Financial Accelerator and the Credit Channel: At the Credit Channel of Monetary Policy in the Twenty-first Century Conference, Federal Reserve Bank of Atlanta, Atlanta, Georgia (Jun. 15, 2007), available at

[2] 15 U.S.C. § 77a et seq.

[3] William O. Douglas, Protecting the Investor, Yale Rev. (Mar. 1934) (stating that “even though an investor has neither the time, money, nor intelligence to assimilate the mass of information in the registration statement, there will be those who do so, whenever there is a broad market. The judgment of those experts will be reflected in the market price”), available at

[4] 15 U.S.C. § 78b.

[5] Dirk A. Zetzsche, Ross P. Buckley, Douglas W. Arner & Janos N. Barberis, From FinTech to TechFin: The Regulatory Challenges of Data-Driven Finance, 14 N.Y.U. J.L. & Bus. 393 (Spring 2018).

[6] See Jeff Desjardins, Cybersecurity: Fighting a Threat That Causes $450B of Damage Each Year, Visual Capitalist (Nov. 28, 2017), available at

[7] Star Wars Episode IV: A New Hope (Lucasfilm Ltd. 1977).

[8] Allie Nicodemo, Paging Doctor Data: Machine Learning and the Future of Healthcare, News@Northeastern (Aug. 17, 2017), available at

[9] Megan Molteni, Health Care is Hemorrhaging Data. AI is Here to Help, Wired (Dec. 30, 2017), available at

[10] Josh Becker, 4 Ways that Law Firms Benefit from Legal Analytics, LexisNexis Legal & Professional (2018), available at

[11] Federal Bureau of Investigation, Combined DNA Index System (CODIS), (last visited Sept. 26, 2018).

[12] David Trainer, Why Robo-Analysts, Not Robo-Advisors, Will Transform, Forbes (Jul. 19, 2017), available at

[13] I should note, however, that past performance is not necessarily an indication of future results. See 17 C.F.R. 230.482(b)(3)(i).

[14] Laura Noonan, JPMorgan’s $11bn fintech bazooka, Financial Times (Sept. 24, 2018), available at

[15] Innovation Enterprise Channels, Infographic: Big Data In Everyday Life, How do we use Big Data in our day-to-day lives? (Mar. 19, 2018), available at

[16] Commissioner Kara M. Stein, A Vision for Data at the SEC: Keynote address to Big Data in Finance Conference (Oct. 28, 2016), available at; see also Commissioner Kara M. Stein, Disclosure in the Digital Age: Time for a New Revolution (May 6, 2016), available at

[17] Andrew W. Lo, Moore’s Law vs. Murphy’s Law in the Financial System: Who’s Winning, J. of Investment Mgmt, Vol. 15, No. 1, 17-38 (2017), available at

[18] Investment Company Reporting Modernization, SEC Release No. IC-32314 (Oct. 13, 2016), available at

[19] What is Structured Data?, U.S. Securities and Exchange Commission (last viewed Jun. 22, 2018), available at

[20] Kate O’Flaherty, Cyber Warfare: The Threat From Nation States, Forbes (May 3, 2018), available at

[21] Andy Greenberg, The Untold Story of NotPetya, the Most Devastating Cyberattack in History, Wired (Aug. 22, 2018), available at

[22] See World Economic Forum, Global Risks 2018: Fractures, Fears and Failures, available at; see also Juniper Research, The Future of Cybercrime & Security: Enterprise Threats & Mitigation 2017-2022 (2017).

[23] Alejandro Hernandez, Are You Trading Stocks Securely? Exposing Security Flaws in Trading Technologies, IOActive (Aug. 7, 2018), available at

[24] See Regulation S-P, 17 C.F.R. § 248.30, available at; see also FINRA, Topic Page: Cybersecurity, (last visited Sept. 26, 2018).

[25] An NMS stock ATS (alternative trading system) must notify the public when confidential trading information is compromised. If an NMS stock ATS’s public disclosures “materially differ from the actual means by which the NMS Stock ATS protected the confidential trading information of subscribers, the ATS would be required to file an amendment pursuant to Rule 304(a)(2) to revise its Form ATS-N to accurately describe such safeguards and procedures.” Regulation of NMS Stock Alternative Trading Systems, Release No. 34-83663 (July 18, 2018), available at

[26] See Regulation Systems Compliance and Integrity Proposal, Release No. 34-69077 (Mar. 8, 2013), available at

[27] Commissioner Kara M. Stein, Statement on Commission Statement and Guidance on Public Company Cybersecurity Disclosures (Feb. 21, 2018), available at

[28] Catalin Cimpanu, Maersk Reinstalled 45,000 PCs and 4,000 Servers to Recover From NotPetya Attack, BleepingComputer (Jan. 25, 2018), available at
