Making the right connections

SCIENCE FOUNDATION IRELAND: A SCIENCE FOUNDATION Ireland-funded strategic research cluster based at UCD and NUIG is carrying out world-leading data analytics research which has applications in developing new treatments for cancer, detecting financial fraud, and making social networks more robust and efficient.

Clique is focused on graph and network analysis and visualisation. Its industry partners include IBM, Idiro Technologies and Norkom Technologies. With total funding of more than €5 million from SFI and its industry partners, Clique has more than 20 postdoctoral researchers and PhD students on staff.

The group’s research is strongly influenced by the commercial challenges faced by its industry partners. Clique aims to conduct research and provide solutions to real-world problems. Its industry partners facilitate this by providing access to large volumes of data, as well as information about the characteristics of real data, which is then used to validate the research outputs directly.

“The whole area of data analytics has a big buzz about it at the moment,” says Clique director Prof Pádraig Cunningham. “This is because there is so much data out there. Accenture has set up a data analytics centre in Dublin, the IBM Smarter Cities centre is all about data analytics, and PayPal is looking at setting up a centre here as well. Clique looks at networks: the objects in them and the relationships between them.”

For social networks these can be tweeters, followers and retweets on Twitter; friends, pages and likes on Facebook; or connections on LinkedIn. In biology they can be the groups of genes and cells that interact with each other to switch on a gene which causes a certain type of cancer.

The group’s name is a deliberate pun which will be readily understood by mathematicians. “In life, a clique is a small tightly knit group,” Prof Cunningham explains. “But it also means something in graph theory. It is a network where all the nodes within it are linked directly to each other.” No degrees of separation in other words.
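The graph-theory meaning of the word can be made concrete with a short sketch. The small friendship network and the `is_clique` helper below are invented purely for illustration and are not taken from the group's own software; the function simply checks that every pair of nodes in a set is directly linked.

```python
from itertools import combinations

def is_clique(adjacency, nodes):
    """Return True if every pair of distinct nodes is directly linked."""
    return all(
        b in adjacency.get(a, set())
        for a, b in combinations(nodes, 2)
    )

# Hypothetical undirected network: each person maps to the set of
# people they are directly connected to.
adjacency = {
    "ann": {"bob", "cat", "dan"},
    "bob": {"ann", "cat"},
    "cat": {"ann", "bob"},
    "dan": {"ann"},
}

# ann, bob and cat all know each other, so they form a clique.
print(is_clique(adjacency, ["ann", "bob", "cat"]))

# dan is linked only to ann, so adding him breaks the clique.
print(is_clique(adjacency, ["ann", "bob", "cat", "dan"]))
```

The first call reports a clique and the second does not, because bob and dan, like cat and dan, have no direct link.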

The group looks at data in three contexts – social networks, financial transactions and biological networks.

One collaboration in the social media area which Clique is working on at the moment is with Storyful, the online news service founded by former RTÉ journalist Mark Little. Storyful is different to other news services because it takes much of its content from ordinary people or so-called “citizen journalists” who are tweeting or putting information about newsworthy events on their Facebook pages.

Indeed, much of the coverage of the Arab Spring by a variety of news organisations came from these sources.

“Twitter is a very important tool for Storyful,” Cunningham points out. “They might know of two or three Twitter users who may be useful in providing information on a certain story. We analyse the data on the network to identify other users who might also be useful for that particular story.”

The users are identified by their connections and activity. This is very important for a service such as Storyful which relies for much of its content on non-professional contributors. The more people contributing to one story the easier it is to verify and corroborate it.
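One simple way to picture this kind of analysis is to start from the few known users and rank other accounts by how many of those known users are connected to them. The sketch below is an assumption made for illustration only: the article does not describe Clique's actual method, and the account names, the `follows` data and the `suggest_users` function are all hypothetical.

```python
from collections import Counter

def suggest_users(follows, seeds, top_n=3):
    """Rank non-seed accounts by how many seed accounts follow them."""
    seeds = set(seeds)
    counts = Counter()
    for seed in seeds:
        for other in follows.get(seed, set()):
            if other not in seeds:
                counts[other] += 1
    # Sort by count (highest first), then by name so ties are stable.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [user for user, _ in ranked[:top_n]]

# Hypothetical data: each known source maps to the accounts it follows.
follows = {
    "seed_a": {"u1", "u2"},
    "seed_b": {"u2", "u3"},
    "seed_c": {"u1", "u2"},
}

print(suggest_users(follows, ["seed_a", "seed_b", "seed_c"], top_n=2))
```

Here `u2` is followed by all three known sources and `u1` by two, so they are suggested first. A real system would also weigh activity, such as what users post and retweet, rather than connections alone.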

In the financial fraud arena it is all about detecting the patterns of transactions that are indicative of fraudulent activity. Fraud also occurs in social media and other online information services.

Tripadvisor.com brands itself the world’s most trusted travel site and as such must ensure that all information on it is reliable.

“You can think of Tripadvisor as a social network with the nodes being the various hotels and destinations listed as well as the reviews posted on it,” says Cunningham. “We work with Tripadvisor to help them identify fraudulent or dodgy reviews. What happens from time to time is that a hotel might get some friends to put up some good reviews. In fact, it is estimated that between 5 per cent and 15 per cent of the reviews on the site may be bogus. We look at the patterns of the reviews and can identify which ones are likely to be dodgy.”

Possibly the most exciting area of work for Clique is in the biomedical sphere. “It turns out that the analytics for social networks are the same in biology and we are looking at biological networks,” Cunningham explains. “There was an idea that there was a specific gene that caused cancer. This was wrong but what we now know is that it is networks of genes that are involved. We try to understand how these networks work. The trick might be in stopping a gene from activating and that might prevent cancers in certain situations.”

It is not a case of simply targeting the gene which might be the final link in the chain before the cancer starts, but of influencing the entire network which causes that gene to behave in certain ways in the first place.

Clique is currently moving into the second phase of its activity. “We are a strategic research cluster and we are very much industry facing and focused,” says Cunningham. “At the moment we are looking for new industry collaborators like Storyful to come on board in the three main areas of social networks, finance and biology.”