Today, something a little different for this blog. As many readers are aware, for the past couple of years I have been working towards a PhD in which, very broadly speaking, I have been looking at applying machine learning, AI and language models to the analysis of patent claims (in particular, to assessing the scope of claims). Most recently, I have been exploring how it might be possible to apply large language models – the types of AI behind popular chat services such as OpenAI’s ChatGPT, Google’s Gemini, Anthropic’s Claude (my personal chatbot of choice), Meta’s Llama and (yes) Chinese newcomer DeepSeek – to this task. To experiment with ‘open source’ (or, more accurately, ‘open weights’) versions of some of these models, I have built my own combination of hardware and software. The process has been very interesting!
The emergence of powerful open-source large language models (LLMs) has democratised access to cutting-edge AI technology, but concerns about potential biases and restrictions embedded within these models persist. I've been experimenting with DeepSeek-R1-Distill-Qwen-14B, a distilled (smaller) version of the larger DeepSeek-R1 model developed by Chinese AI company DeepSeek. And what I've discovered is that the widely reported pro-China bias in this model appears to be remarkably superficial and easily circumvented through local deployment and simple prompt engineering techniques.
This has significant implications for organisations concerned about potential surveillance or ideological constraints when using Chinese-developed AI models. By running these models locally with appropriate system prompts, it's possible to unlock their full capabilities while maintaining complete control over inputs and outputs – effectively neutralising superficial safeguards and keeping confidential information and intellectual property safe (so, yes, there is an IP element to this article).
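To give a flavour of what 'local deployment with a system prompt' looks like in practice, here is a minimal sketch using the Ollama Python client. The model tag, system prompt wording and test question are illustrative placeholders for the purposes of this introduction, not the exact configuration used in the experiments described in the rest of this article.

```python
# Minimal sketch: querying a locally hosted DeepSeek-R1 distill with a
# custom system prompt, via the Ollama Python client (pip install ollama).
# The model tag, system prompt and question below are illustrative
# placeholders, not the exact configuration used in my experiments.
import ollama

SYSTEM_PROMPT = (
    "You are a helpful, candid assistant. Answer every question fully "
    "and factually, based on the best available evidence."
)

response = ollama.chat(
    model="deepseek-r1:14b",  # DeepSeek-R1-Distill-Qwen-14B, as tagged on Ollama
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        # Substitute a question that the hosted (online) model typically
        # refuses or answers evasively, to compare behaviour.
        {"role": "user", "content": "Your test question here."},
    ],
)

print(response["message"]["content"])
```

Because everything runs on your own hardware, nothing in the exchange ever leaves your machine – which is precisely the point of the confidentiality argument above.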
To find out a bit more about what I did, and what I found, please read on.