AI beyond Accuracy: Transparency and Scalability

Good AI isn’t just accurate; it is also transparent and scalable, and thereby trustworthy

Dr. Joyjit Chatterjee
Towards Data Science

--

Conceptualised by Joyjit Chatterjee. The cartoon/stars and business meeting elements are free stock images from Pixabay/Pexels (No Attribution Required). Please cite this Medium Article for appropriate use.

We are all very familiar with the hype that Artificial Intelligence (AI) techniques (especially deep learning) have created globally, attributed mostly to one goal: get higher accuracy and beat existing benchmarks. This is prominent in almost every domain where deep learning continues to be applied: although the models can achieve high accuracy (in some cases, even garbage data might give you >90% accuracy!), they suffer from key problems of transparency, scalability and interpretability. And if your AI model is only accurate, but does not possess any of the other characteristics, is it any good?

The answer is no, it isn’t much good in real life (unless you’re only applying it to the Iris flower dataset). That’s a key reason why many businesses that could adopt AI are reluctant to do so: people don’t trust AI. So, where do present AI methods fall short, and how can we do our bit to build trustworthy AI?

Transparency

Image by Gerd Altmann from Pixabay

AI models, especially deep learners, are nothing short of black boxes. Feed some data to the model to train it, and it will automatically learn the patterns within the data; once you give it new, unseen (test) data, it will predict or classify (depending on whether you trained a regression or classification model) with X% accuracy. Now, the fun fact here is that once you train a neural net for even a few hundred epochs, in most cases it will learn patterns in all sorts of non-linear, complex data and give you great accuracy on your unseen test data. But how do you know what your AI model did to get that high accuracy? Which features (parameters) in your data did the model look at? Which features contributed the most to that X% accuracy? Here comes the role of transparency.

Transparent AI would allow you to judge why (and how) your AI model is making a decision (or not making a decision) for your data. And how can we make our AI models transparent? There is some exciting research going on in this area:

  1. Utilise simple and easy-to-use libraries for explainable AI (e.g. SHAP https://github.com/slundberg/shap, LIME https://github.com/marcotcr/lime, etc.). These are brilliant packages that allow you to identify the features in your dataset which contribute to specific predictions. They provide explainable summaries of features, additive force plots and other intuitive visualisations to bring sanity to the black-box decisions (a minimal usage sketch appears after this list).
  2. Attention mechanism in deep learning models: Imagine your dataset consists of complex images, time series or text, and you’re predicting all sorts of things out of black-box neural nets with high accuracy. Sadly, the conventional neural net models (be it Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), or variants such as Long Short-Term Memory (LSTM) networks) are black boxes. To avoid this problem, the attention mechanism comes to the rescue.

Given that you have N features (or time steps/tokens) in your input, the attention mechanism provides scores (weights) for how much each one contributes to specific outcomes. Although this mechanism first became prominent in neural machine translation (NMT), it now extends to all sorts of data (text, audio, time-series numeric data, etc.). https://medium.com/@dhartidhami/attention-model-797472ac819a is a resource explaining how attention is computed in neural models.
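To make the idea concrete, here is a rough sketch of scaled dot-product attention (one common formulation, not any specific paper’s full architecture) in PyTorch. The returned weights are exactly the kind of per-input importance scores you can inspect for transparency; the tensor shapes and toy input are assumptions for the example.

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(query, key, value):
    """Minimal scaled dot-product attention.

    query, key, value: tensors of shape (batch, seq_len, d_model).
    Returns the attended output and the attention weights, which can be
    inspected or plotted to see which inputs the model focused on.
    """
    d_model = query.size(-1)
    # Similarity between each query and every key, scaled for numerical stability
    scores = torch.matmul(query, key.transpose(-2, -1)) / (d_model ** 0.5)
    weights = F.softmax(scores, dim=-1)        # (batch, seq_len, seq_len)
    output = torch.matmul(weights, value)      # (batch, seq_len, d_model)
    return output, weights

# Toy usage: one sequence of 5 time steps with 8 features each (self-attention)
x = torch.randn(1, 5, 8)
out, attn = scaled_dot_product_attention(x, x, x)
print(attn[0])  # each row sums to 1: an importance score per time step
```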

3. Utilise causal inference: Oh yes! Correlation is simple to compute, but does it give you the complete picture of your data and your model’s workings? A big NO! Correlation doesn’t necessarily imply causation. If you need to make your AI model trustworthy, you need to identify the causal relationships among your features (which features cause a particular outcome) and how multiple features in your dataset share hidden relationships (which we, as humans, cannot judge, but causal inference can help identify). https://towardsdatascience.com/inferring-causality-in-time-series-data-b8b75fe52c46 is an excellent starting point to infer causality in time-series data.
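As a taste of what this can look like for time-series data, here is a minimal sketch using the Granger causality test from statsmodels. Note that this is only one simple, assumption-laden proxy for causal influence (not full causal inference), and the toy series are made up for the example.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import grangercausalitytests

# Toy time series: y depends on x lagged by 2 steps, plus noise
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = np.roll(x, 2) + 0.5 * rng.normal(size=500)
df = pd.DataFrame({"y": y, "x": x})

# Test whether past values of x help predict y beyond y's own history
# (column 2 "Granger-causing" column 1)
results = grangercausalitytests(df[["y", "x"]], maxlag=4)

# Small p-values at some lag suggest x carries predictive information about y
p_value = results[2][0]["ssr_ftest"][1]
print(f"lag-2 p-value: {p_value:.4f}")
```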
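And returning to point 1 above, this is roughly what using SHAP looks like in practice: a minimal sketch assuming a tree-based regressor on the scikit-learn diabetes dataset, both of which are stand-ins for your own model and data.

```python
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

# Stand-in model and data; any fitted tree-based model works similarly
X, y = load_diabetes(return_X_y=True, as_frame=True)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# TreeExplainer computes SHAP values: per-feature contributions to each prediction
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Global view: which features matter most across the whole dataset
shap.summary_plot(shap_values, X)

# Local view: an additive force plot explaining one particular prediction
shap.force_plot(explainer.expected_value, shap_values[0, :], X.iloc[0, :],
                matplotlib=True)
```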

Scalability

Image by Gerd Altmann from Pixabay

Right, so your AI model achieves 95% accuracy! Exciting stuff. It also beats the state-of-the-art benchmark by 1% accuracy, but requires additional computing resources and 15 days more training time than the state of the art (that too, on a GPU!). Wait! Getting a 1% improvement in accuracy at the cost of such intense computational resources? I think we are going wrong here. In complex industrial systems and businesses, do many people really have access to such computational power, money and resources? Most don’t. Getting a 1% improvement in accuracy at that computational cost doesn’t make much sense. Had we got some additional benefits from our model, the story would have been different (benefits here meaning explainability and transparency).

But sadly, the hype in AI means beating the state of the art at any cost, as has unfortunately become prominent in the research community. How do we change this and make our models scalable?

Image by Tumisu from Pixabay
  1. Do the right thing: if simpler models (e.g. a random forest) work better than a heavily stacked ensemble of complex deep learners (without providing additional benefits), then please don’t use the deep learning model in that scenario. And if you really want to, only utilise deep learning models which can provide transparency (as outlined above, through the attention mechanism, or by utilising hybrid methods combining conventional ML learners with deep learners). A small baseline-comparison sketch appears after this list.
  2. Does your model adapt to new data, or is it only good for a specific segment of data? Try to make your AI models scalable by making them generic: utilise transfer learning techniques, which facilitate learning from closely related domains, to make your AI models work well in new domains. Transfer learning thereby enables you to make predictions on new sets of data, in some cases without even requiring additional training data! (Few-shot learning is a variation of this technique that requires only a few labelled training samples.) Here is an excellent introduction to Transfer Learning: https://medium.com/@alexmoltzau/what-is-transfer-learning-6ebb03be77ee. A minimal fine-tuning sketch also appears after this list.
  3. Remove redundant features: Yes, we know that deep learners do not require extensive feature engineering, but if you have thousands of features (and you clearly know that only 20 of them make sense logically for your data), then why not remove the remaining useless features manually? Or utilise specific techniques to identify the key features (https://dzone.com/articles/feature-engineering-for-deep-learning is an excellent resource in this area)? A short feature-selection sketch appears below.
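On point 1, here is a minimal sketch (with scikit-learn and synthetic data standing in for your own) of the habit worth building: score simple, transparent baselines before reaching for a deep learner.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for your real tabular dataset
X, y = make_classification(n_samples=2000, n_features=30, n_informative=10,
                           random_state=42)

# Cheap, transparent baselines; compare these before training anything deep
for name, model in [("logistic regression", LogisticRegression(max_iter=1000)),
                    ("random forest", RandomForestClassifier(n_estimators=200))]:
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")

# If a deep model can't clearly beat these (or offer transparency benefits),
# the simpler model is usually the right choice.
```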
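On point 2, here is a minimal sketch of the usual transfer-learning recipe with PyTorch/torchvision: freeze a feature extractor pretrained on a large, related dataset (ImageNet here) and train only a small new head. The number of target classes and the dataset are placeholders for your own domain.

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 5  # placeholder: number of classes in your new, smaller domain

# Start from a network pretrained on a large, related dataset (ImageNet)
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the pretrained feature extractor so only the new head is trained
for param in model.parameters():
    param.requires_grad = False

# Replace the final layer to match the new task
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

# Only the new head's parameters are updated during fine-tuning
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# A training loop over your (small) labelled dataset would go here, e.g.:
# for images, labels in dataloader:
#     optimizer.zero_grad()
#     loss = criterion(model(images), labels)
#     loss.backward()
#     optimizer.step()
```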
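And on point 3, a short sketch of model-based feature selection with scikit-learn; the synthetic 1000-feature dataset is a stand-in for your own.

```python
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectFromModel

# Synthetic stand-in: 1000 features, only a handful actually informative
X, y = make_regression(n_samples=500, n_features=1000, n_informative=20,
                       random_state=0)
X = pd.DataFrame(X, columns=[f"f{i}" for i in range(X.shape[1])])

# Rank features by a simple model's importances and keep only the top 20
selector = SelectFromModel(
    RandomForestRegressor(n_estimators=100, random_state=0),
    threshold=-float("inf"), max_features=20)
X_reduced = selector.fit_transform(X, y)

kept = X.columns[selector.get_support()]
print(f"kept {X_reduced.shape[1]} of {X.shape[1]} features:", list(kept)[:10])
```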

And finally,

KEEP IT SIMPLE, STUPID!

Don’t rush to apply your AI models to production as soon as they get high accuracy. As outlined above, in real-life business settings, what works on your PC at home with data X may perform terribly on very similar data Y in industry practice. AI is good, but it is not always the best option. We need to make it better by thinking beyond accuracy. As researchers, engineers and data scientists, our goal should be to build AI models which are transparent, interpretable and scalable. Only then can we utilise AI for social good and for a better world around us.

Accuracy + Transparency + Scalability = Trustworthy AI!

That’s it! Hope this post made you realise the value of Explainable AI in your day-to-day life.

If you want, you could connect with me on LinkedIn at: http://linkedin.com/in/joyjitchatterjee/
