21 May 2017

Identifying Patent-Eligible Software Claims... Using Software

As I have previously reported, the major and immediate impact of the US Supreme Court’s Alice decision in June 2014 was to reduce the rate of ‘business method’ patents issued by the US Patent and Trademark Office (USPTO) by three-quarters, while having a negligible effect on ‘technical’ software-implemented inventions.  While the data in my earlier article ended in December 2015, I have now been able to update my results to the end of March 2017, as shown in the chart below.  There has been no change in the overall trend during the intervening 15 months, with ‘business method’ patent grants still running at around 50% of 2007 numbers, while technical software patents, and patents across all areas of technology, are issuing at rates nearly twice those of 2007.
[Chart: Patent Grants (March 2017)]
In a recent post on the Bilski Blog, US patent agent Mark Nowotarski has made similar observations about the impact of Alice on ‘business method’ patent grants, going on to analyse the characteristics of those patent claims that are still being allowed in the USPTO’s business method Art Units.  He has noted that it is commonly by including ‘physical limitations’ (e.g. reciting hardware such as ‘mobile devices/sales kiosks’ or ‘physical sensors’) and/or ‘software limitations’ (e.g. reciting technical functionality such as ‘graphics/image processing’ or ‘cryptography/security’) that applicants have been able to overcome Alice-based subject-matter rejections.

This got me thinking.  If there are common forms of technical language that arise in patent-eligible claims, then might it be possible to train a machine-learning system to predict whether a particular claim is, or is not, likely to be patentable?

It turns out that this does indeed appear to be possible.  I built a machine-learning model using data published by the USPTO, including the claims of 24,462 recently-abandoned and 4967 recently-allowed applications, all examined within ‘software’ and ‘business methods’ Art Units.  In cross-validation tests (i.e. using a portion of the known data for training, and the remainder to test model performance) I was able to achieve around 75% prediction accuracy.  In trials of a hand-picked ‘random’ sample of more recent patents and published applications, not in the training/test set, the model correctly classified all actually allowed claims (of four examples) as patentable.  It also classified the claims of four abandoned applications as unpatentable, and two published-but-rejected-and/or-amended claims as unpatentable.  In only one case did the model classify a claim that had been rejected on subject-matter grounds as likely-patentable.

The model may thus be capable, with a probability of success of over 70%, of determining whether or not a proposed claim to a computer-implemented invention includes sufficient technical content to overcome a subject-matter-based rejection, at least under the Alice test as it is applied by the USPTO.

Source Data

The USPTO’s Patent Claims Research Dataset (PCRD) is available free for download, and contains detailed information on claims from US patents granted between 1976 and the end of 2014, and from US patent applications published between 2001 and the end of 2014.  Bibliographic and status information for all published US patents and applications is available in the USPTO’s Patent Examination Data System (PEDS).  I have written about PEDS on a previous occasion – it contains much of the data that is accessible online via the Public Patent Application Information Retrieval (PAIR) system.

Building the Training Data Set

To build my training data set, I identified all patents and applications assigned for examination in any of the ‘software’ or ‘business methods’ Art Units of the USPTO according to the same definitions I have used previously, i.e.
  1. ‘software’ Art Units are 2120-2129 (Miscellaneous Computer Applications), 2140-2149 and 2170-2179 (Graphical User Interface and Document Processing), 2150-2169 (Data Bases & File Management), and 2190-2199 (Interprocess Communication & Software Development);
  2. ‘business method’ Art Units are 3621-3629 (Electronic Commerce), 3680-3689 (Business Methods), and 3690-3699 (Business Methods – Finance).
Within these Art Units I looked for all applications filed after 1 July 2010 (i.e. subsequent to the Supreme Court’s Bilski decision) and which were either allowed or abandoned after 1 July 2014 (i.e. subsequent to the Alice decision and rapid drop-off in business method patent grants).  I considered relevant allowed applications to be those having any one of the following status descriptions in PEDS/PAIR:
  1. Patented Case
  2. Notice of Allowance Mailed -- Application Received in Office of Publications
  3. Abandoned  --  Failure to Pay Issue Fee
  4. Publications -- Issue Fee Payment Verified
  5. Publications -- Issue Fee Payment Received
  6. Awaiting TC Resp., Issue Fee Not Paid
  7. Awaiting TC Resp, Issue Fee Payment Verified
  8. Awaiting TC Resp, Issue Fee Payment Received
Abandoned cases are those having either of the following status descriptions:
  1. Abandoned  --  Failure to Respond to an Office Action
  2. Abandoned  --  After Examiner's Answer or Board of Appeals Decision
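The selection rules above can be sketched in a few lines of Python.  This is only an illustration of the filtering logic, not the code I actually used: the function names and argument names (`art_unit`, `filing_date`, `status`, `status_date`) are hypothetical and do not correspond to actual PEDS field names, and I have assumed that ‘after 1 July’ excludes the date itself.

```python
from datetime import date
from typing import Optional

# Art Unit ranges as listed above (inclusive).
SOFTWARE_RANGES = [(2120, 2129), (2140, 2149), (2150, 2169),
                   (2170, 2179), (2190, 2199)]
BUSINESS_RANGES = [(3621, 3629), (3680, 3689), (3690, 3699)]

# Status descriptions as listed above, with runs of whitespace collapsed.
ALLOWED_STATUSES = {
    "Patented Case",
    "Notice of Allowance Mailed -- Application Received in Office of Publications",
    "Abandoned -- Failure to Pay Issue Fee",
    "Publications -- Issue Fee Payment Verified",
    "Publications -- Issue Fee Payment Received",
    "Awaiting TC Resp., Issue Fee Not Paid",
    "Awaiting TC Resp, Issue Fee Payment Verified",
    "Awaiting TC Resp, Issue Fee Payment Received",
}
ABANDONED_STATUSES = {
    "Abandoned -- Failure to Respond to an Office Action",
    "Abandoned -- After Examiner's Answer or Board of Appeals Decision",
}

FILED_AFTER = date(2010, 7, 1)    # post-Bilski filings only
DECIDED_AFTER = date(2014, 7, 1)  # post-Alice outcomes only


def art_unit_category(art_unit: int) -> Optional[str]:
    """Return 'software', 'business method', or None if out of scope."""
    if any(lo <= art_unit <= hi for lo, hi in SOFTWARE_RANGES):
        return "software"
    if any(lo <= art_unit <= hi for lo, hi in BUSINESS_RANGES):
        return "business method"
    return None


def label_application(art_unit: int, filing_date: date,
                      status: str, status_date: date) -> Optional[str]:
    """Return 'allowed', 'abandoned', or None if the case is excluded."""
    if art_unit_category(art_unit) is None:
        return None
    if filing_date <= FILED_AFTER or status_date <= DECIDED_AFTER:
        return None
    normalised = " ".join(status.split())  # collapse repeated spaces
    if normalised in ALLOWED_STATUSES:
        return "allowed"
    if normalised in ABANDONED_STATUSES:
        return "abandoned"
    return None
```

Note that ‘Abandoned -- Failure to Pay Issue Fee’ is deliberately labelled as allowed: the claims were accepted by the examiner, even though no patent issued.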
For the abandoned cases, I extracted the first independent claim of the pre-grant publication from the PCRD (where available). For allowed cases I considered three different options:
  1. the first independent claim of the granted patent publication (available in the PCRD only for patents issued prior to 31 December 2014);
  2. the first independent claim of the pre-grant publication (available in many more cases, but not necessarily representative of the claims ultimately allowed); or
  3. a combination of the above, i.e. the patented claim where available, otherwise the claims as published.
Interestingly, it turned out that the performance of the model was not much affected by which of these three options I chose.  Possibly the benefit of using claims that are known with high confidence to be patentable (i.e. the granted claims) is offset by the larger amount of information available in the pre-grant publication data set.  My preference is to use the granted claims only, since this seems to be a less arbitrary, and thus more logically defensible, choice.
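For concreteness, the three options for choosing an allowed claim might be expressed as follows.  This is a minimal sketch: the function name and the option labels ("granted", "published", "combined") are my own shorthand for this illustration, and a missing claim is represented by `None`.

```python
from typing import Optional


def select_allowed_claim(granted_claim: Optional[str],
                         published_claim: Optional[str],
                         option: str) -> Optional[str]:
    """Pick the first independent claim to use for an allowed case.

    'granted'   -> option 1: the patented claim only
    'published' -> option 2: the pre-grant publication claim only
    'combined'  -> option 3: the patented claim if available,
                   otherwise the published claim
    """
    if option == "granted":
        return granted_claim
    if option == "published":
        return published_claim
    if option == "combined":
        return granted_claim or published_claim
    raise ValueError(f"unknown option: {option!r}")
```

Under the "granted" option, any allowed case without a patented claim in the PCRD returns `None` and simply drops out of the training set.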

This resulted in a training data set containing 24,462 abandoned claims and 4967 granted claims.

There is an obvious asymmetry in this data: while we know that the granted claims are patent-eligible (because an examiner allowed them), we have no idea why the abandoned applications were abandoned.  In some cases the applicant might have given up due to insurmountable subject-matter rejections, in other cases there may have been different rejections, e.g. lack of novelty or obviousness, while in still others the applicant may simply have lost interest in prosecuting the application.  This will most likely limit the model’s ability to learn to distinguish specifically between claims directed to eligible or ineligible subject matter.

Machine Learning Model and Performance

For the machine learning system itself, I compared a number of different models, with a range of parameters, to see which performed best, including Naïve Bayes Bernoulli and Multinomial classifiers, Logistic Regression, Support Vector Machines, Random Forest classifiers, and basic Neural Networks (Multi-Layer Perceptron).  I have written about machine learning and cross-validation on my other blog, in the context of a spam classifier example.  Suffice to say that I used the same technique to assess comparative performance of different patent claim classifier models.  The ‘winning’ model was a classifier based on Ridge Regression (a.k.a. Tikhonov Regularization) which, with some optimisation, was able to achieve around 75% successful prediction in cross-validation.  Specifically, I obtained the following confusion matrix for the classifier.
This table shows that the rates of false positives (i.e. abandoned applications that were predicted to be patentable) and of false negatives (i.e. patented cases that were predicted to have been abandoned) are both around 25%.  As I have already noted, however, it is impossible for the classifier to perform perfectly because there is an unknown proportion of applications that are abandoned for reasons that are unrelated to the specific wording of the claims (which is the only information from which the model can learn).  There will be claims to patent-eligible subject matter in the group of actual abandoned applications.  Some of these will be ‘correctly’ classified as patent-eligible, but will appear as false positives in cross-validation.  Conversely, the classifier may ‘learn’ from these cases that the patent-eligible claims are ‘unpatentable’, which will lead to false negatives.
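To make the relationship between these error rates and overall accuracy concrete, the following sketch computes the usual summary figures from a two-by-two confusion matrix.  The counts are illustrative only: they are not my actual results, but were obtained by applying a 25% error rate to each of the two class sizes given above (24,462 abandoned, 4967 granted).

```python
from typing import Dict


def summarise(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Summary statistics for a binary classifier.

    'Positive' means predicted patentable; 'negative' means predicted
    abandoned.
    """
    positives = tp + fn   # actually granted
    negatives = tn + fp   # actually abandoned
    total = positives + negatives
    return {
        "accuracy": (tp + tn) / total,
        "false_positive_rate": fp / negatives,
        "false_negative_rate": fn / positives,
        # share of 'patentable' predictions that were actually granted
        "precision": tp / (tp + fp),
        # accuracy of always predicting the larger class
        "majority_baseline": max(positives, negatives) / total,
    }


# Illustrative counts only (assumed 25% error rate in each class).
metrics = summarise(tp=3725, fp=6116, tn=18346, fn=1242)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because abandoned claims outnumber granted claims by roughly five to one, the per-class error rates are more informative than raw accuracy.  With these illustrative counts a trivial rule of always predicting ‘abandoned’ would score about 83% raw accuracy while never identifying a patentable claim, and only around 38% of the claims predicted to be patentable would be actual grants.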

Tests Using ‘Unseen’ Claims

I hand-picked a few patents/applications from an unsorted list of cases that were recently patented or abandoned, and for which there are no claims in the PCRD (which, as noted above, includes only claims published up until the end of 2014).  These cases were therefore completely unseen by the model during training.

In each case, I obtained the main independent claim from the USPTO’s Full Text Databases, and reviewed the file history in the Public PAIR system.  I ran the claim text through the classifier (now trained using the complete training set, without cross-validation), and compared its prediction with the actual outcome in each case.  The results are summarised below.
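The distinction between cross-validation and training on the complete set can be illustrated with a toy example.  The sketch below is not my actual model: it uses a handful of made-up claim fragments, simple word-presence features, and a ridge (L2-regularised least-squares) classifier fitted by plain gradient descent rather than a library solver.  All names and parameter values (`alpha`, `lr`, `epochs`) are assumptions chosen for the illustration.

```python
from typing import Dict, List, Tuple

Model = Tuple[Dict[str, float], float]  # (word weights, bias)


def tokens(text: str) -> set:
    """Word-presence features: the set of lower-cased words."""
    return set(text.lower().split())


def fit_ridge(docs: List[str], labels: List[int], alpha: float = 0.1,
              lr: float = 0.1, epochs: int = 500) -> Model:
    """Fit targets of +1/-1 by least squares with an L2 penalty."""
    weights: Dict[str, float] = {}
    bias = 0.0
    token_sets = [tokens(d) for d in docs]
    n = len(docs)
    for _ in range(epochs):
        residuals = [bias + sum(weights.get(t, 0.0) for t in toks) - y
                     for toks, y in zip(token_sets, labels)]
        grad: Dict[str, float] = {}
        for toks, r in zip(token_sets, residuals):
            for t in toks:
                grad[t] = grad.get(t, 0.0) + 2.0 * r / n
        for t, g in grad.items():
            w = weights.get(t, 0.0)
            weights[t] = w - lr * (g + 2.0 * alpha * w)  # L2 penalty term
        bias -= lr * 2.0 * sum(residuals) / n            # bias not penalised
    return weights, bias


def predict(model: Model, text: str) -> int:
    """Return +1 (predicted eligible) or -1 (predicted ineligible)."""
    weights, bias = model
    score = bias + sum(weights.get(t, 0.0) for t in tokens(text))
    return 1 if score >= 0 else -1


def k_fold_accuracy(docs: List[str], labels: List[int], k: int) -> float:
    """Hold out every k-th example in turn; train on the remainder."""
    n = len(docs)
    correct = 0
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        train_docs = [docs[i] for i in range(n) if i not in test_idx]
        train_labels = [labels[i] for i in range(n) if i not in test_idx]
        fold_model = fit_ridge(train_docs, train_labels)
        correct += sum(predict(fold_model, docs[i]) == labels[i]
                       for i in test_idx)
    return correct / n


# Made-up claim fragments: +1 = allowed, -1 = abandoned.
DOCS = ["processor encrypting image data",
        "sensor measuring signal with processor",
        "encrypting sensor data on device",
        "pricing a contract for payment",
        "payment for insurance contract",
        "pricing insurance for a customer"]
LABELS = [1, 1, 1, -1, -1, -1]

cv_accuracy = k_fold_accuracy(DOCS, LABELS, k=3)  # performance estimate
model = fit_ridge(DOCS, LABELS)                   # final fit on all data
print(cv_accuracy, predict(model, "sensor encrypting data"))
```

Cross-validation gives an estimate of how the model performs on claims it has not seen, while the final model, fitted to every available example, is the one applied to genuinely new claims.  In practice the data would be shuffled before splitting, and word counts or weighted term frequencies would probably serve better than simple word presence.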

US publication no. 20170017714 claims ‘a method of automatically generating tags for media content’, and was allowed on 13 April 2017.  The classifier predicted that published claim 1 is patent-eligible.

US publication no. 20150193830 is directed to ‘systems and methods for optimizing marketing decisions based on visitor profitability’, was rejected on subject matter and other grounds, and has been abandoned.  The classifier predicted that published claim 24 is patent-ineligible.

US publication no. 20150178844 is entitled ‘Customized Retirement Planning’, was rejected on subject matter and other grounds, and has been abandoned.  The classifier predicted that published claim 1 is patent-ineligible.

US publication no. 20150235156 is directed to ‘a method of enabling capacity on demand in a computing system using a calendar’, was rejected on subject matter and other grounds, and has been abandoned.  The classifier predicted that published claim 1 is patent-ineligible.

US patent no. 9,652,538 is directed to a ‘web crawler optimisation system’, and was issued on 16 May 2017.  The classifier predicted that granted claim 1 is patent-eligible.  Interestingly, the considerably more abstract claim 1 of the pre-grant publication no. 20150161257 was predicted by the classifier to be patent-ineligible, although it was not rejected on this ground in the first Office Action (in which the primary ground of rejection was obviousness).

US patent no. 9,646,257 is entitled ‘Probabilistic assertions and verifying them’, and was issued on 9 May 2017.  The classifier predicted that both granted claim 1 of the patent, and claim 1 of the corresponding publication no. 20160063390 are patent-eligible.

US publication no. 20150242590 is entitled ‘System for and Method of Providing Healthcare Services’, was rejected on subject matter and other grounds, and has been abandoned.  The classifier predicted that published claim 1 is patent-ineligible.

US publication no. 20150178292 is entitled ‘Methods and systems for data serialization and deserialization’, was rejected on subject matter and other grounds, and has been abandoned.  The classifier predicted that published claim 1 is patent-eligible.  It therefore made a ‘wrong’ prediction in this case.  However, the claim recites ‘a method for data serialization, comprising: obtaining a first metafile; obtaining structured data to be serialized; and serializing the structured data based on the first metafile, the serialized data following a format of (length, value) or a format of (value) for each data field.’  While this claim is absurdly broad, and I would not expect it to be novel and nonobvious, it is not immediately apparent to me that a method of this kind, for encoding information, is inherently patent-ineligible.  The model might therefore be excused for disagreeing with the examiner!

US publication no. 20150089406 is entitled ‘Methods and apparatus for user interface optimization’, and is due to be issued as patent no. 9,658,735 on 23 May 2017.  The classifier predicted that published claim 1 is patent-ineligible.  This may be correct; however, that claim was never examined because it was cancelled in a preliminary amendment.  The claim that will issue as claim 1 of the granted patent, which is set out below, was predicted (correctly) by the classifier to be patent-eligible.

A system for user interface optimization, the system comprising: a rules base configured to store a plurality of rules that define an application having a user interface; a rules engine configured to execute at least one rule from the rules base; and a digital data processor in communication with the rules base and the rules engine, wherein the system is configured for: identifying one or more rules for execution by the rules engine so as to generate any of a markup language page providing a user interface and a markup language stream providing the user interface, determining whether one or more aspects of the user interface generated as a result of execution of the one or more rules is in conformity with one or more requirements, wherein the one or more requirements are defined relative to any of (a) one or more other rules and/or a user interface generated based thereon, (b) transactional data relating to the user interface, (c) a context in which the user interface is any of transmitted, displayed, and viewed by a user, and (d) a collection defining any of grammar, spelling, usage, punctuation, and style of the user interface; responding to a negative such determination by executing any of: i. generating a notification that identifies modifications to the one or more rules so as to generate at least one of the markup language page and the markup language stream providing a conforming user interface, wherein execution of the one or more rules would otherwise result in a non-conforming user interface, ii. modifying the one or more rules so as to generate the at least one of the markup language page and the markup language stream providing the conforming user interface, and iii. modifying the at least one of the markup language page and the markup language stream providing the conforming user interface, and any of storing to and generating as output from the digital data processing system at least one of the generated notification, the modified one or more rules, the modified markup language page, and the modified language stream providing the conforming user interface.

Overall in this handful of tests, the classifier correctly predicted patent-eligibility in all four examples of allowed/patented claims.  It was also correct in all but one case in relation to rejected/abandoned claims.  This performance is consistent with the 75% success rate indicated by the cross-validation tests.
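As a rough check on what ‘consistent’ means with so few examples, the following sketch computes the binomial probability of a given number of correct predictions.  It assumes that each prediction is independent and correct with probability 0.75, and it tallies the nine cases listed above once each, with eight predictions agreeing with the actual outcome; other ways of counting (e.g. including the pre-grant claims separately) are equally defensible.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """Probability of at least k successes in n independent trials,
    each succeeding with probability p (binomial upper tail)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(k, n + 1))


# Assumed tally: 9 cases, 8 predictions matching the actual outcome.
p_eight_or_more = prob_at_least(8, 9, 0.75)
print(f"{p_eight_or_more:.2f}")
```

On these assumptions the probability comes out at about 0.30, so a classifier that is right 75% of the time would do at least this well on nine cases nearly one time in three.  The result is therefore unsurprising at that success rate, but nine cases are far too few to pin the true rate down with any precision.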

Conclusion – An Encouraging First Effort?

The machine learning model described in this article could be the basis of a useful tool for patent applicants and attorneys.  Based on cross-validation and further tests, it certainly appears able to predict whether a proposed claim is more likely than not to recite patent-eligible subject matter.  All else being equal, given the choice between submitting a claim that ‘passes’ classification and one that does not, an applicant would have better prospects with the ‘passing’ claim.

It is entirely possible that with more and better data, e.g. including claims published since January 2015, and with additional information about grounds of rejection and reasons for abandonment, an even more effective classifier could be developed.



Copyright © 2014
The Patentology Blog by Dr Mark A Summerfield is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Australia License.