Using maths to find the culprit! 🧐 The statistics of text 📖 📚 📄 🗞

Did you know that maths can be used to find patterns in text to help police work out who wrote a ransom note? 🧐👮🏼‍♀️ Text can be represented as mathematical structures called networks. Networks are made up of things called nodes and edges. We can think of nodes as houses 🏠 that may or may not be connected by paths 🏘, which are the edges. For a piece of text, each word is a node of the network, and two words are joined by an edge when they appear near each other in the text. By comparing patterns in networks like these, it is possible to infer the gender, education level and even personality traits of the writer of the ransom note! This information can then narrow down the search for the culprit! 🕵🏻‍♂️
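
To make the idea of a word network concrete, here is a minimal sketch in Python. It is only an illustration, not the method used in the research described here: the function name `build_word_network`, the `window` parameter (how many words apart still counts as "near") and the simple lowercase word splitting are all assumptions made for this example.

```python
import re
from collections import defaultdict


def build_word_network(text, window=2):
    """Build a word network as {word: {neighbour: count}}.

    Each distinct word is a node. Two words share an edge when they
    appear within `window` positions of each other; the count records
    how often that happens. Assumption: sentence boundaries are ignored.
    """
    # Assumption: lowercase the text and keep only letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    network = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        # Look ahead at the next `window` words.
        for j in range(i + 1, min(i + 1 + window, len(words))):
            other = words[j]
            if other == word:
                continue  # skip edges from a word to itself
            network[word][other] += 1
            network[other][word] += 1
    return {w: dict(neighbours) for w, neighbours in network.items()}


# A made-up ransom note, purely for illustration.
note = "Leave the money under the bridge. Leave the bag by the bridge."
network = build_word_network(note, window=1)

for word in sorted(network):
    print(word, "->", network[word])
```

With `window=1` only directly adjacent words are joined. A larger window links words that are further apart, which gives a denser network. Comparing networks built from different pieces of writing is then a separate statistical step that this sketch does not attempt.
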
Using statistics on text makes a lot of sense, since analysing large amounts of text by hand would take far too long. For example, to decide if an email is spam, a computer must first look at thousands of emails and find a pattern in what makes a ‘spam email’. A person wouldn’t have time to sit and go through every email to decide what is spam, so we need to train a machine to do this for us automatically using statistics 💻

Other uses of this technique include plagiarism detection and comparing novels written by different authors. This work is being undertaken by Katie Severn at the University of Nottingham 🙌🏻

