Sysero speaks up – beyond the AI hype cycle

We first introduced Sysero back in 2012, after five years of struggling to make SharePoint work as a KM system. Our move from the SharePoint-based Sysero 1.0 to our own proprietary 2.0 system took several years, and in the time since we have added hundreds of features, many based on direct feedback from our clients.

We made some decisions back in 2012 that have proved their worth 13 years on. For example, we chose Lucene as our text search engine. It runs on standard hardware (no $billion data centre required), provides blistering-fast performance and, with tens of thousands of people using our implementation every day, has proved very resilient. We also designed a flexible classification and workflow system that today supports compliance processes involving hundreds of data points and process steps.

If I’m honest, I’m a bit smug about that. So when I first heard the term Retrieval Augmented Generation (RAG) nine months ago, whilst working with a new law firm client, I wasn’t that excited. At over 1,200 staff, the client represents the top end of the UK market and has built up an enviable Knowledge Management library. As their current KM platform is no longer supported, they went looking for a new solution and happily selected Sysero Cogent. One of the main reasons for their decision was the premise that the management of knowledge requires structured workflows to manage content and categorisation for search and retrieval. Having already invested the time and staff to build a library covering over 20 legal practices, they knew that maintaining the quality of their knowledge libraries was going to be a challenge that traditional document databases don’t address.

Having delivered a couple of dozen KM systems to law firms in the UK, Europe and the US in the last decade, Sysero Cogent demonstrably meets the requirements of fast, flexible search, integration with the major LawTech DMS, CRM and PMS systems, and the workflow features mentioned earlier. What we lacked, though, was AI. In fact, until I started working with this client I was an ardent AI denier. Having seen the money wasted on the AI wave of the last 10 years, I had firmly maintained the position that KM was about providing intuitive search and content management. That’s when I was forced to look at an approach that involves sending selected documents into a Large Language Model to help with document drafting, summarisation, research and compliance.

As we already had a system that collects and maintains a law firm’s proprietary know-how, implementing RAG was actually quite easy. Our Version 1 RAG approach was to let users select documents using their own classifications and send them into LLMs such as OpenAI’s. Sysero allows firms to implement their own categorisation systems, so users could select, say, policies and procedures in IP, and answer specific questions asked by their clients using their in-house expertise. I showed this to the client and, bingo, we closed the deal.
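In outline, the Version 1 flow is simply “filter by the firm’s own categories, then hand the selected text to the model”. Here is a minimal sketch in Python; the function and field names (`select_by_category`, `build_prompt` and so on) are illustrative, not Sysero’s actual API, and the documents are invented:

```python
# "Version 1" RAG sketch: users pick documents via the firm's own
# classifications, and the selected text goes to the LLM with the question.

def select_by_category(library, category):
    """Return only the documents the user's classification filter matches."""
    return [doc for doc in library if category in doc["categories"]]

def build_prompt(question, documents):
    """Assemble the selected know-how and the question into one LLM prompt."""
    context = "\n\n".join(f"[{d['title']}]\n{d['text']}" for d in documents)
    return ("Answer using ONLY the firm's documents below.\n\n"
            f"{context}\n\nQuestion: {question}")

library = [
    {"title": "IP Filing Policy", "categories": {"IP", "Policy"},
     "text": "All patent filings must be reviewed by a partner."},
    {"title": "Lease Template", "categories": {"Real Estate"},
     "text": "Standard commercial lease clauses."},
]

docs = select_by_category(library, "IP")
prompt = build_prompt("Who must review patent filings?", docs)
# `prompt` would then be sent to whichever LLM provider the firm uses.
```

The value is all in the `library`: the categories are the firm’s own taxonomy, so the filter only ever surfaces documents the firm already trusts.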

However, this led me down the rabbit hole. If this AI stuff is as good as they say it is, how are law firms going to differentiate themselves from ChatGPT? How am I going to stay in business if my clients are going to be replaced by the likes of Gemini, or even Alexa, who it was announced this week is emerging from her egg-timer phase into an all-powerful AI goddess in the next few weeks?

So I took the blue pill (or is it the red one?) and spent six weeks in AI isolation, consuming every bit of knowledge on Hallucinations, Tokenization, Embedding, Dimensions and Named Entity Filtering (I do this so you don’t have to). I ended up with the solution to a problem we’ve had for years but never really acknowledged. That issue is similarity. As a peddler of Knowledge, we’ve always been delighted with the choice of search engine we made 13 years ago. In that time, DMS juggernaut iManage has gone through at least three other search engines before settling on the same one we have had for years. As mentioned before, Lucene is incredibly fast and has lots of features, but it is basically a system where words go in and documents containing those words come out. Over the years we’ve added features like keyword phrases and a thesaurus, but fundamentally these don’t handle concepts.

This is where the new breed of semantic search engines comes in. They differ from keyword search engines in that they break down documents into “chunks” (and yes, that is the technical term) and automatically classify those chunks across dozens of categories. AUTOMATICALLY! No people involved! Really? Now that is really cool, as our KM systems require people to do that manually, and it’s a lot of work. It turns out it’s not all good news, though: the classifications used by semantic search engines (called embeddings) are designed by tech geeks, not people who know how to classify documents in a way that suits your law firm. So I’m afraid manual classification is still a good idea if you want to find M&A Precedents, not Real Estate Contracts, with 100% accuracy. What it does mean, however, is that any questions your lawyers or clients have can be matched to your existing knowledge (classified by all that hard work) to extract relevant information. As it comes from your KM system, you know it’s “real knowledge” (not a hallucination) and can cite the source along with all that useful KM metadata: how long ago it was published, who wrote it and so on.
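To make “concepts rather than words” concrete, here is a toy illustration of matching by embedding similarity. Real models emit vectors with hundreds or thousands of dimensions; the tiny hand-made three-dimensional vectors below are purely illustrative:

```python
import math

def cosine(a, b):
    """Cosine similarity: 1.0 means identical direction in concept space."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-made 3-dimensional "embeddings" (invented for illustration).
# Read the axes loosely as (corporate, property, litigation).
vectors = {
    "share purchase agreement": [0.9, 0.1, 0.1],
    "merger precedent":         [0.8, 0.2, 0.1],
    "commercial lease":         [0.1, 0.9, 0.1],
}

query = vectors["share purchase agreement"]
ranked = sorted(vectors, key=lambda k: cosine(query, vectors[k]), reverse=True)
# "merger precedent" shares no words with the query, yet ranks above
# "commercial lease" because it sits nearby in concept space -- exactly
# the match a pure keyword engine cannot make.
```

The same arithmetic, scaled up to real model embeddings and millions of chunks, is what a semantic search engine does on every query.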

So a semantic search engine is stage 1 of a RAG system. (Technically it’s stage 2, as stage 1 is embeddings, but I’m cutting out some stuff in the name of brevity.) It compares your question with its knowledge and gives you chunks of information that are similar to the question. You can ask any natural language question, and even include misspellings and typos in your query, and it will still give you a search results list that will help you answer your question. Stage 2 (technically stage 5, but I’ll shut up now) is the flashy AI bit. This is where we send only the relevant chunks to the AI, and it uses them to provide a natural language answer to a natural language question.
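The two visible stages can be sketched end to end. A real system scores chunks with embedding similarity and calls an actual LLM; in this sketch a crude word-overlap score and a pluggable `llm` callable stand in for both, so only the shape of the pipeline should be taken as given:

```python
def score(question, chunk):
    """Crude stand-in for embedding similarity: count of shared words."""
    words = lambda text: {w.strip(".,?") for w in text.lower().split()}
    return len(words(question) & words(chunk))

def retrieve(question, chunks, k=2):
    """Stage 1: return the k chunks most similar to the question."""
    return sorted(chunks, key=lambda ch: score(question, ch), reverse=True)[:k]

def answer(question, chunks, llm):
    """Stage 2: send ONLY the retrieved chunks to the LLM, never the library."""
    context = "\n".join(retrieve(question, chunks))
    return llm(f"Context:\n{context}\n\nQuestion: {question}")

chunks = [
    "Completion accounts must be delivered within 60 days of completion.",
    "The landlord may increase rent annually in line with RPI.",
    "Warranty claims must be notified within 18 months of completion.",
]

# `llm` is any callable that takes a prompt string -- plug in your
# provider's client there. retrieve() on its own shows stage 1:
top = retrieve("When must completion accounts be delivered?", chunks, k=1)
```

Because only the retrieved chunks ever reach the model, the answer is grounded in the firm’s own knowledge and each chunk can carry its citation along with it.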

So obviously this is cool, but why bother with all of this when you can spend €20/month on Claude? Well, whilst ChatGPT, Claude, DeepSeek et al can answer any question on any subject, they don’t have your domain knowledge, and as often as not they make stuff up. AIs really want to please and will invent things they don’t explicitly know. They do this through deduction, and it’s very clever, but what they don’t know they often just assume, and we all know how that goes, don’t we? In fact, there are many articles stating that rather than getting better, newer and larger AIs are suffering from hallucinations more and more.

Implementing a RAG system, especially for firms that already have a KM system (or even a DMS database called “knowledge”), can be done relatively quickly and gives mid-market firms a massive opportunity over both their smaller and, weirdly, their larger competitors. The fact that you have managed knowledge tailored to your specialisms, and only your specialisms, means you can answer questions and fix issues with speed and accuracy.

Phil Ayton is CEO of Sysero and can be contacted via the Contact Us page on Sysero.com.

Sysero provides document automation, workflow automation, contract management and knowledge management solutions to law firms.