
Intraspexion: Deploying Trained Deep Learning Algorithms to Prevent Litigation

  • 8 March 2018
  • Expert Insights

This post is part of our new Future of Law series, which interviews leading founders and executives on the front lines of the industry to better understand what problems the industry is facing, what trends are taking place, and what the future looks like.

The following is an interview we recently had with Nick Brestoff, CEO of Intraspexion.

1. What’s the history of Intraspexion? Where and how did you begin?

NB: Intraspexion was born out of many attempts, none of which panned out, to find a technology for “preventing litigation.” When I was a litigation specialist, I was happy to see technology come back into my life. With prevention still in mind, I loved learning about latent semantic analysis (as embodied in Content Analyst, which is now a part of Relativity) and iterative statistical sampling, the workflow now called Technology Assisted Review.

Then I met Bill Inmon and he suggested a book, which became Preventing Litigation: An Early Warning System, etc., and which I'm proud to say was endorsed by Richard Susskind. Bill is known as the “father of the data warehouse,” and his approach was based on relational databases and taxonomies.

But during the editing process, I saw an article describing how much funding AI was attracting, and on p. 169 of the book I mentioned DeepMind (already acquired by Google) and MetaMind (later acquired by Salesforce).

Then I reached out to Richard Socher, the founder of MetaMind, and he provided two keys: an insight into the type of training data one needs for training based on text (a label and lots of examples) and an open API into MetaMind's algorithm.

Having worked with the federal litigation database (PACER.gov), I knew that there were many categories of litigation and where to find the examples of risks — in the factual allegations of complaints that had been vetted by attorneys before the case was filed.

The category I chose was Civil Rights-Jobs, which is, in other words, employment discrimination.

I started working with a very experienced software engineer, Mike Becker, to extract these examples from PACER. I knew Mike because we had both been “mentors” to the local high school robotics competition team.

From Richard Socher's team, I learned what an “example” meant, namely all of the paragraphs in the “facts” section of the complaint (but not more), and how many examples his system would require (200).

We backed off to 50 and baby-stepped up to 200, and then went to 400 examples.

But that was for training. Now that we had an algorithm that had been trained to “understand” discrimination, what could we test it on? I knew from my litigation experience that emails were the carriers of intent. But since discovery materials are not public, they weren't available. So (like many others), we accessed the only publicly available dataset, which was the Enron set.

Because Mike had lived in Houston when Enron failed, he chose Ken Lay, the then Chairman and CEO of Enron.

I expected nothing other than to see whether the system might function, in general. Both Mike and I were astonished when, after inputting about 5,000 of Ken Lay's emails, the system reported that 28 were “related” to a discrimination risk. That's about a half of one percent. (We're down to about one-eighth of one percent now.)

Mike sent the emails to me to review, so AI here really means Augmented human Intelligence.

I found 27 false positives and 1 true positive, and we still showcase that example in our Home page video. (We can't showcase corporate emails for confidentiality reasons.)

Anyway, I think we were the first to find this one, lone needle. It was a true needle in the haystack because the content made it clear that the writer had not yet been fired.

Mike had had a long career with IBM, HP, Compaq, Microsoft, and Hitachi, and had seen it all. But even he was surprised and thought we had something special.

So, on the day the book was published, I incorporated Intraspexion.

2. What specific problem does Intraspexion solve? How do you solve it?

NB: The first problem we're addressing is litigation and how to avoid it. Obviously, we aren't giving seminars to train employees about some litigation disaster that's just been a very painful and costly experience. By then, the horse is out of the barn.

The problem we tackled was whether we could look at emails (or any text) in almost real time and find the risks of potential lawsuits before the damage was done. That is why we picture the risk as an underwater mine. Litigation is an explosive device hidden beneath what used to be called the data lake and is now clearly an ocean of data.

Who's closest to that data and in a position to do something about it, if they only had an early warning of the risk? Corporate legal counsel. But they had no way to see that data.

So that's what we've invented, a way to surface emails “related” to specific risks, e.g., discrimination, and report them to a company-designated list of appropriate recipients. Then they can investigate the risk and advise a control group executive (someone who speaks for the client) about whether and how to address the risk.

We do this in the Shakespearean way I've described, where what's past is prologue. (That's from The Tempest, Act II, Scene 1. Why would I know this? Because my brother Richard is a Professor of Drama and Head of Acting at UC Irvine.)

So here's where “the past” comes in. We train a Deep Learning algorithm (we currently use TensorFlow, which Google open-sourced) with past examples of specific types of litigation, as filed against others as well as against our prospective customer.

Then comes the future: new emails, which are the test data, where we look for risk, hopefully in time to prevent it from escalating. When our system is deployed, we index and run copies of emails through a pre-trained Deep Learning algorithm (which is resident on a GPU card inside a server) and present the results “related” to the risk category back to a user.

At bottom, Deep Learning is pattern-matching.


t-SNE visualization of the training data.

The plot shows two sets of training examples: red dots that are positive for the risk of discrimination and white dots that are negative for the risk. The algorithm needs “tuning” to keep the red dots out of the white cluster and the white dots out of the cluster of reds.

Then there's a clear boundary for decision-making, and the algorithm is ready to score emails it's never seen before.
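
As a rough illustration of what scoring against a decision boundary looks like, the sketch below surfaces only the emails whose score clears a threshold. Everything in it is hypothetical: `Email`, `surface_related`, the sample messages, and especially `toy_score`, a keyword counter that merely stands in for the trained Deep Learning model, which is not described here in enough detail to reproduce.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class Email:
    message_id: str
    body: str


def surface_related(
    emails: Iterable[Email],
    score: Callable[[str], float],
    threshold: float = 0.5,
) -> List[Tuple[Email, float]]:
    """Return (email, score) pairs at or above the threshold, highest score first."""
    scored = [(email, score(email.body)) for email in emails]
    flagged = [pair for pair in scored if pair[1] >= threshold]
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


def toy_score(text: str) -> float:
    """Hypothetical stand-in for a trained model: counts a few fixed cue phrases."""
    cues = ("passed over", "because of my age", "harass")
    hits = sum(cue in text.lower() for cue in cues)
    return min(1.0, hits / 2)


# Made-up messages, for illustration only.
inbox = [
    Email("a1", "Lunch at noon?"),
    Email("a2", "I was passed over again, and I think it is because of my age."),
    Email("a3", "She was passed over for the project lead role."),
]

flagged = surface_related(inbox, toy_score, threshold=0.75)
for email, s in flagged:
    print(email.message_id, s)
```

Moving the threshold is one way to trade false positives against missed emails; in a real deployment the score would come from the trained model rather than from keywords.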

Recently, out of 20,401 never-before-seen Enron emails, the algorithm surfaced 25, a fraction of only about one-eighth of one percent. Out of the 25, 21 were false positives, while 4 were true positives. We had already identified the 4 true positives from previous work, so they were easy to spot.
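
The figures above can be checked with a few lines of arithmetic. This is only a sketch using the numbers quoted in the interview; the variable names are illustrative.

```python
# Figures quoted in the interview for the never-before-seen Enron emails.
total_emails = 20_401
surfaced = 25
true_positives = 4
false_positives = 21

# Share of all emails that the algorithm surfaced for review.
flag_rate = surfaced / total_emails

# Share of surfaced emails that were genuinely related to the risk.
precision = true_positives / (true_positives + false_positives)

print(f"flag rate: {flag_rate:.4%}")
print(f"precision: {precision:.0%}")
```

The flag rate comes out just under one-eighth of one percent (0.125%), and 4 of the 25 surfaced emails, or 16%, were true positives. Recall cannot be computed from these figures, because the number of relevant emails the algorithm did not surface is not given.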

But given that rate, a company could generate 2 million emails a month, and a reviewer would need to look at only about 112 emails in a given workday, or about 16 emails per hour.
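
The workload estimate can be reproduced roughly as follows. The interview does not say how many workdays or review hours were assumed; 22 workdays a month and 7 review hours a day are assumptions chosen here because they land on the quoted figures.

```python
import math

emails_per_month = 2_000_000
flag_rate = 25 / 20_401        # rate from the Enron run quoted in the interview

workdays_per_month = 22        # assumption, not stated in the interview
review_hours_per_day = 7       # assumption, not stated in the interview

flagged_per_month = emails_per_month * flag_rate
flagged_per_day = flagged_per_month / workdays_per_month
flagged_per_hour = flagged_per_day / review_hours_per_day

print(f"per month: {flagged_per_month:.0f}")
print(f"per workday (rounded up): {math.ceil(flagged_per_day)}")
print(f"per hour (rounded up): {math.ceil(flagged_per_hour)}")
```

Under these assumptions, about 2,451 emails a month would be surfaced, which rounds up to 112 per workday and 16 per hour. A different workday count or review schedule would shift the result slightly.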

Thus, when we run copies of internal enterprise communications through our system, we are, essentially, “pinging” the ocean of data. A bounce-back is an email “related” to the risk. That's why we say that Intraspexion is to litigation what sonar is to underwater mines.

3. What’s the future of law?

Prediction #1: Attorneys pattern-match too. They learn appellate precedents and match new situations to them. They will continue to possess and express judgment and wisdom, without relying on machine learning. But they can't keep 1,000 precedents or 100,000 emails in mind. Machines now have very high (and growing) “memory” and processing speed, and we will learn how they can help us. The future is coming at us more quickly than ever before, and both AI in the form of Deep Learning and blockchain will be go-to tools in the Corporate Law Departments of the Future.

Prediction #2: Now and into the near future, attorneys will address and solve the problems they currently address, but they'll do it with greater efficiency. The recent rise of “Legal Ops” is an early indicator. For example, on March 5 and 6, Today's General Counsel held its first Institute focusing on Legal Ops.

Prediction #3: The problems attorneys cannot currently address will become addressable, in large part because of advanced forms of machine learning like Deep Learning.

4. What are the top 3 technology trends you’re seeing in the legal industry?

Trend #1: I have seen AI being applied to help defendants win lawsuits (how to select the right attorney; how to choose the most favorable venue; how to forecast the outcome of a case before the Supreme Court of the United States).

Trend #2: I see AI being applied to help companies manage contracts (many now).

Trend #3: I hope to see AI being applied to help companies avoid litigation.

5. Why is the legal industry ripe for disruption?

A. The previous (and astounding) advances in AI have been based on images. Examples: DeepMind's algorithm masters many of the Atari games; Google DeepMind's algorithm (AlphaGo) plays the Asian game of Go against human champions and wins (in 2016 and 2017); and a Carnegie Mellon team develops an AI-based system to play poker and wins competitions against two different teams of human championship-level players (2017).

B. An even earlier example involved the head of Microsoft Research using AI for audio and speech translation. See this 2014 TED talk by Jeremy Howard: “The Wonderful and Terrifying Implications of Computers That Can Learn.”

C. So the legal industry is ripe for disruption because:

C1. The students who've done well in law school and make up the legal profession are (typically) not well trained in math or software, and certainly not in AI like Deep Learning. So the most hopeful assessment of AI in the legal profession is that it still hasn't hit the elbow of an exponential curve, which is when the explosive growth will take place.

C2. The AI field, from a technology perspective, has focused more on images (driverless cars, drones, robotics) than on word-based problems (other than recommendation systems and speech translation). Yet word-based problems will be the True North for the Legal Department of the Future. The disruption of the profession will occur when companies address the word-based problems that can't be solved now but which are worth solving.

About Nick Brestoff

Nick Brestoff is the inventor, founder and CEO of Intraspexion, a Legal AI startup which was recently named in the inaugural edition of The National Law Journal's list of Legal AI Leaders. Nick was educated in engineering at UCLA and at the California Institute of Technology, and then in law at USC, where he was a member of the Law Review. Nick enjoyed a 38-year career as a California litigator and is now retired from the profession.

Nick started writing about “Data Lawyers and Preventive Law” in 2012 and is the primary author of Preventing Litigation: An Early Warning System, etc., which Business Expert Press published in 2015. He is the lead inventor on all seven (7) patents in Intraspexion's current portfolio of software system patents, in which the Artificial Intelligence is Deep Learning, the AI that's been called “the new electricity.”
