ALGOPATENT - Towards Synthetic Patents

Disclaimer: this article does not constitute legal advice. My views and understanding may have changed since the date of publication. Readers should consult qualified professionals for advice tailored to their specific circumstances.

Towards Synthetic Patents [pdf]

by François Veltz - Algopatent

Published on LinkedIn October 2024

Executive summary

Qatent, originating from INRIA, pioneered AI verticalization for patent drafting three years ago, ahead of the ChatGPT surge, addressing generic LLMs' limitations in handling textual subtleties. Key hypotheses include indistinguishable machine-human texts, superior co-authored patents, and robust synthetic applications through claim variants. From claims or invention disclosures, Qatent generates complete documents using specialized models, with tools for claim crafting and post-editing. Economic benefits encompass reduced time-to-file, low costs, formal verification, enhanced examination robustness, and portfolio consistency. Emerging developments feature patent-trained generative models, science-to-patent pipelines, prior art-guided interactive drafting, multimodal drawing annotations, sentence classifications, inventive pattern reinjection, problem-solution algorithms, writing metrics, terminological amplification, coherent longform generation, jargon-free rewriting, text densification, defensive publications, cross-domain analogies, and claim morphing. Companies mastering these will dominate algorithmic IP.

Towards Synthetic Patents?

Most large language models (LLMs) are generic and not particularly well suited to drafting patent applications. Few companies are tackling the last mile, i.e. integrating the various available tools (LLMs, LVLMs, etc.), whether proprietary or open source, to draft patent applications, whose text is riddled with subtleties that practitioners know and generally master.

Stemming from INRIA, Qatent began this arduous verticalization three years ago, well before the ChatGPT buzz, with three hypotheses in mind. One, machine text and human text are indistinguishable. Two, future patent applications will be texts co-written by human and machine. The machine leaves nothing out, can provide lists of technical alternatives, and can check, correct, enrich and densify human texts. Three, synthetic patent applications are superior linguistic objects, because they are more robust to examination, in particular through the accumulation of claim variants. From claims, or even an invention disclosure, Qatent computes and generates a complete patent application, with each section of the text produced by one or more AI models. Qatent also provides various tools to facilitate claim drafting and post-editing of the generated text.

The economic and legal value propositions are clear and decisive: drastic reduction of time-to-file, very low-cost drafting, verification of formal irregularities in place of or in addition to peer review, enriched drafting and increased robustness for examination, portfolio consistency, and so on.

This new continent calls for many entirely new developments. The challenges and prospects are fascinating from the standpoint of patent law, the scientific content of applications, and the AI technologies involved.

From the short term to the long term, we can mention:

generative models trained on patent data alone;
articulation of the patent corpus with the corpus of scientific articles (from preprint to patent application in a few clicks for fast provisionals);
interactive generation, and in particular drafting guided by the maximization of differences given a designated prior art document;
integration of prior art search and text generation;
implementation of multimodal AI, e.g. automatic annotation of drawings by computer vision;
sentence-by-sentence classification of a patent document;
interactive intermediate generalizations;
re-injection of inventive patterns;
algorithmic encoding of the "problem-solution" approach;
algorithmic oppositions and third-party observations;
implementation of writing quality metrics (e.g. good and bad practices);
amplification of terminological trends detected in patent classifications, etc.;
coherent text generation ("longformers");
rewriting of the technical teachings of the patent corpus currently obfuscated by "legalese" jargon;
text densification (e.g. injection of alternative words);
text generation in volume for defensive publication strategies (e.g. freedom to operate), whether or not accompanying one or more official administrative filings;
transpositions or reasoning by analogy between technical fields; and
“claim morphing” and semantic space paving operations.
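
One item in the list above, drafting guided by the maximization of differences given a designated prior art document, can be illustrated with a toy sketch. The code below is not Qatent's method: it assumes a purely lexical proxy (Jaccard distance over word sets) and ranks candidate claim variants from most to least different from a prior art text. The function names, the prior art sentence and the claim variants are all hypothetical, invented for illustration; a real system would presumably rely on semantic models rather than word overlap.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercased word set; a crude stand-in for patent-aware tokenization."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard_distance(a: str, b: str) -> float:
    """1 - |A & B| / |A | B| over word sets; 0.0 means identical vocabularies."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    if not union:
        return 0.0
    return 1.0 - len(ta & tb) / len(union)


def rank_variants(variants: list[str], prior_art: str) -> list[tuple[float, str]]:
    """Sort candidate claim variants from most to least different from the prior art."""
    scored = [(jaccard_distance(v, prior_art), v) for v in variants]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical prior art and claim variants, invented for illustration only.
prior_art = "A bicycle brake comprising a cable and a caliper."
variants = [
    "A bicycle brake comprising a cable and a caliper with a spring.",
    "A bicycle brake comprising a hydraulic piston and a ceramic disc.",
]

ranked = rank_variants(variants, prior_art)
for score, text in ranked:
    print(f"{score:.2f}  {text}")
```

In an interactive setting, such a score could be recomputed as the drafter edits a claim, giving a rough signal of how far the wording has moved from the designated document. Lexical distance says nothing about novelty or inventive step in the legal sense; it only shows the shape of the feedback loop.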

Companies that master the algorithmic manipulation of claims, and therefore of detailed descriptions, will gain a decisive advantage in this sovereign and strategic field of algorithmic IP.
