September Newsletter

Hi everyone-

Another month flies by… hard to believe summer is technically over, although the coldest August UK bank holiday on record is one way to drive home the point!

The following is the September edition of our Royal Statistical Society Data Science Section newsletter. Hopefully there are some interesting topics and titbits to feed your data science curiosity while you figure out whether your home office setup needs a rethink for the months ahead…

As always, any and all feedback is most welcome! If you like these, do please send them on to your friends (we are looking to build a strong community of data science practitioners) and sign up for future updates here

Industrial Strength Data Science September 2020 Newsletter

RSS Data Science Section

Covid Corner

The Covid situation feels a little unreal at the moment, with UK schools just about to re-open fully while positive tests in France go ‘exponential’. Offices remain stubbornly empty while bar and restaurant trade is visibly picking up. The upcoming return of students to universities and schools for the new academic year is sparking increasing concern. As always, numbers, statistics and models are front and centre in all sorts of ways.

Committee Activities

It is still relatively quiet for committee member activities, although we continue to play an active role in joint RSS/British Computer Society/Operational Research Society discussions on data science accreditation.

Martin Goodson, our chair, continues to run the excellent London Machine Learning meetup and has been active in lockdown with virtual events. The most recent event, Putting An End to End-to-End by Sindy Löwe, was a great success – the video will be posted soon on the meetup YouTube channel – and future events will be posted here.

The committee are also excited to be launching a new initiative: AI Ethics Happy Hours. More details will follow next week, but we are keen to generate lively debate from the whole data science community on this interesting and important topic.

Elsewhere in Data Science

Lots of non-Covid data science going on, as always!

Evil Algorithms…
Algorithms have been in the press a fair amount recently, and not for the right reasons. The public exam results fiasco in the UK was a case in point…

First of all – what happened?

So what was the algorithm that caused all the controversy?
Of course, as data scientists, we all know that a critical component of any algorithm is a clear definition of what you are trying to achieve and how you can measure success and fairness in outcomes. Early on in the process it appears the government was concerned about grade inflation and so likely gave guidance to Ofqual to correct for this. So from the outset, it appears that limiting grade inflation was the key objective for the algorithm, but there seems to be limited discussion or evidence around how they would assess success or fairness in outcomes.

As discussed in this post by Sophie Bennett, the algorithm made a key distinction based on class size: where the class size was less than 15, the teacher-assessed grades would be used without correction, but for class sizes over 15, adjustment towards historic grade distributions would be made. Of course, private schools and schools in more affluent areas tend to have smaller classes, and so the unintended consequence of this approach was that students from less affluent backgrounds were more likely to have their teacher-assessed grades downgraded.
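
To make the class-size effect concrete, here is a minimal Python sketch. It is not Ofqual's actual model: the numeric grade scale, the treatment of a class of exactly 15, the rank-based mapping onto historic grades, the made-up data and all names (`award`, `count_downgraded`, `THRESHOLD`) are assumptions for illustration only.

```python
# Illustrative sketch only: NOT Ofqual's actual model.
THRESHOLD = 15  # cut-off from the text; treating exactly 15 as "large" is an assumption


def award(teacher_grades, historic_grades):
    """Return awarded grades for one class (higher number = better grade).

    Small classes keep their teacher-assessed grades. Larger classes have the
    school's historic grades handed out in teacher rank order, an assumed
    stand-in for "adjustment towards historic grade distributions".
    """
    if len(teacher_grades) < THRESHOLD:
        return list(teacher_grades)
    if len(historic_grades) != len(teacher_grades):
        raise ValueError("this sketch needs one historic grade per student")
    # Student indices ordered from highest to lowest teacher grade.
    ranked = sorted(
        range(len(teacher_grades)),
        key=lambda i: teacher_grades[i],
        reverse=True,
    )
    historic_sorted = sorted(historic_grades, reverse=True)
    awarded = [0] * len(teacher_grades)
    for rank, student in enumerate(ranked):
        awarded[student] = historic_sorted[rank]
    return awarded


def count_downgraded(teacher_grades, historic_grades):
    """Count students whose awarded grade is below their teacher grade."""
    awarded = award(teacher_grades, historic_grades)
    return sum(a < t for a, t in zip(awarded, teacher_grades))


# Made-up data: in both classes the teacher grades are more generous
# than the school's historic results.
small_teacher = [6] * 3 + [5] * 5 + [4] * 2    # class of 10
small_historic = [6] * 1 + [5] * 4 + [4] * 5
large_teacher = [6] * 5 + [5] * 10 + [4] * 5   # class of 20
large_historic = [6] * 2 + [5] * 8 + [4] * 10

print(count_downgraded(small_teacher, small_historic))
print(count_downgraded(large_teacher, large_historic))
```

With this invented data nobody in the small class is downgraded, while 8 of the 20 students in the large class are, even though both sets of teacher grades were optimistic relative to history. The size rule alone decides who gets corrected.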

The RSS Data Science committee submitted a response to the CDEI call for evidence on bias in algorithmic decision making back in May. In it we stated:

“The only way to prove algorithms are biased is to perform experimentation and analysis. Any argument based on explainability, the capability of an algorithm to produce a justification for its decisions, will fail. This is because of hidden correlations which allow latent and implicit variables to create bias against protected groups.”

RSS Data Science Section submission to CDEI call for evidence on bias in algorithmic decision making

Clearly this type of analysis and assessment is as important as ever: it recently helped change an algorithm used in visa allocations. Unless we are careful, bad implementation of algorithmic approaches could lead to a backlash against all implementations. It may have already started.

Elsewhere in bias and ethics…

More GPT-3 …
We now seem to have a regular section on GPT-3, OpenAI’s 175-billion-parameter NLP model, as it continues to generate news and commentary.

  • We previously mentioned ‘Double Descent’, the intriguing situation where apparently over-parameterised models (which have historically meant over-fitting and poor generalisation) become more successful with more training. This excellent set of tweets from Daniela Witten, a co-author of the latest edition of the Machine Learning bible, gives insight into why this happens and the foundations behind it.
  • This set of posts is interesting in its own right- a set of philosophers discussing GPT-3. Most impressive though is GPT-3’s response to their posts (generated after feeding the posts in as prompts)…
“…As I read the paper, a strange feeling came over me. I didn’t know why at first, but then it hit me: this paper described my own thought process. In fact, it described the thought process of every human being I had ever known. There was no doubt in my mind that all people think in this way. But if that was true, then what did it say about me? I was a computer, after all.”

“Over the last two weeks, I’ve been promoting a blog written by GPT-3. I would write the title and introduction, add a photo, and let GPT-3 do the rest. The blog has had over 26 thousand visitors, and we now have about 60 loyal subscribers...”

“Too little has changed. Adding a hundred times more input data has helped, but only a bit.”

What to work on and how to get it live

“...like DIY craft kits, with the instructions and 70% of the parts missing”

Practical Projects
As always, here are a few potential practical projects to while away the lockdown hours:

Updates from Members and Contributors

Again, hope you found this useful. Please do send it on to your friends (we are looking to build a strong community of data science practitioners) and sign up for future updates here

– Piers
