Zero-Knowledge Private Machine Learning on Bitcoin

This post was first published on Medium.

Previously, we demonstrated running a full-fledged deep neural network on Bitcoin, where both the input and model of the machine learning (ML) algorithm are public. In practice, it is often desirable to keep the input or model off-chain and thus private, while ensuring that the ML algorithm is run faithfully. We achieve this by using Zero-Knowledge Proof (ZKP) on ML.

Zero knowledge on the machine learning chain

There are two categories of private information when it comes to ML.

Private input

The input to the model is hidden, but the model itself is public. This is particularly useful for applications involving sensitive and private data such as financial records, biometric data (e.g., fingerprints, faces), medical records, and location information. For example, a person can prove that they are over 21 years of age without revealing their age. Or an insurance company uses a credit-score model for loan approval: the model is published for transparency, but the inputs, such as the applicant’s salary and account statements, are kept confidential.

Private model

The input to the model is public, but the model itself is private, often because it is intellectual property. For example, we use a tumor classification model owned by a private company to detect tumors in images. The model is certified to have 99% accuracy when classifying a public dataset. The company publishes only a cryptographic commitment of its model, i.e., the hash of all model parameters. We can be sure the model is legitimate without ever seeing it. The commitment also ensures that the same model is applied to everyone, for fairness. This is desirable, for example, in an admissions model that ranks candidates based on their public information.
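To make the commitment idea concrete, here is a minimal Python sketch of hashing all model parameters into one digest. The function name and the fixed-point serialization scheme are illustrative assumptions, not the actual code from this post:

```python
import hashlib
import struct

def commit_to_model(weights, biases):
    """Hash all model parameters into a single commitment.

    Each parameter is serialized as a fixed-width big-endian
    integer so the commitment is deterministic, then the SHA-256
    digest of the concatenation is returned.
    """
    h = hashlib.sha256()
    for p in weights + biases:
        h.update(struct.pack(">q", p))  # fixed-point integer parameter
    return h.hexdigest()

# The company publishes only this digest; anyone holding the real
# parameters can recompute it and check that it matches.
commitment = commit_to_model([3, -1, 4], [2])
print(commitment)
```

Because the hash is deterministic, changing a single parameter produces a different commitment, so the published digest pins down one specific model.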

ZKP is a natural fit for maintaining privacy when using ML on-chain, because it can hide off-chain information while proving ML inference is correct.

Classification of handwritten digits

As a demonstration, we have implemented a simple model for classifying handwritten digits. The model was trained using labeled examples from the MNIST dataset. The architecture of the model is very similar to the one we used for our fully chained model.

ZK Circuit diagram

We use ZoKrates to build ZK circuits, which can make any input private trivially, simply by declaring it with the keyword private.

Private input

From the above code we can see that the input to the model, model_inputs, is passed as a private parameter, while the model parameters (weights and biases) are public. When we send input to the model, the circuit performs all the model’s operations on the data and outputs the model’s prediction/class.
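What the circuit computes can be sketched in plain Python (an illustrative sketch only, not the ZoKrates code itself; the function name and the single dense layer are hypothetical simplifications):

```python
def predict(model_inputs, weights, biases):
    """One dense layer followed by argmax, mirroring the circuit:
    model_inputs stays private off-chain, while weights and biases
    are public. Integer (fixed-point) arithmetic, as in a ZK circuit.
    """
    scores = []
    for j in range(len(biases)):
        s = biases[j]
        for i, x in enumerate(model_inputs):
            s += weights[j][i] * x
        scores.append(s)
    return scores.index(max(scores))  # index of the winning class

# Public model, private input: only the predicted class is revealed.
weights = [[1, 0], [0, 1]]
biases = [0, 0]
print(predict([5, 2], weights, biases))  # class 0 scores highest
```

The proof then attests that this computation was carried out correctly on some hidden input, without revealing the input itself.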

Private model

The following is the code to make the model private.

Here, instead of sending the model’s input data, we send the model’s parameters themselves as private. Using these secret parameters, the circuit performs all necessary operations of the model and compares the results with a group of test samples. If the model reaches a certain classification accuracy (CA) threshold, the execution will succeed.
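The accuracy check described above can be sketched as follows (an illustrative Python sketch with hypothetical names; the actual circuit performs this over fixed-point arithmetic in ZoKrates):

```python
def meets_accuracy_threshold(weights, biases, test_samples,
                             test_labels, threshold_pct):
    """Run the (private) model on public test samples and succeed
    only if the share of correct predictions reaches the threshold.
    """
    correct = 0
    for x, label in zip(test_samples, test_labels):
        # one dense layer followed by argmax, per sample
        scores = [b + sum(w * xi for w, xi in zip(row, x))
                  for row, b in zip(weights, biases)]
        if scores.index(max(scores)) == label:
            correct += 1
    # integer comparison avoids division, as a circuit would
    return 100 * correct >= threshold_pct * len(test_samples)

# Private parameters, public test set: the proof only reveals
# whether the threshold was met.
ok = meets_accuracy_threshold([[1, 0], [0, 1]], [0, 0],
                              [[5, 2], [1, 9]], [0, 1], 99)
print(ok)
```

If the check fails, the circuit execution fails, so a valid proof can only be produced for a model that actually clears the classification-accuracy threshold.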

The full code for both the first scenario and the second scenario can be found on GitHub.

Summary

We have demonstrated how to exploit the zero-knowledge property of zk-SNARKs for on-chain machine learning. This allows us to hide specific parts of the ML computation.

References


See: BSV Global Blockchain Convention panel, Blockchain for Digital Transformation of Nations


New to Bitcoin? Check out CoinGeek’s Bitcoin for Beginners section, the ultimate resource guide for learning more about Bitcoin — as originally envisioned by Satoshi Nakamoto — and blockchain.
