Using semantic technology to help companies get a handle on big data
START-UP NATION: SindiceTech: As its research attracted the attention of global organisations, SindiceTech knew the time was right to start commercialising its big idea
LATE LAST SUMMER, Dr Giovanni Tummarello and Dr Renaud Delbru gathered together a core group of NUIG researchers and delivered the big news – they were going into business.
Having looked for a way to commercialise their work on semantic technology and data infrastructures since arriving in Galway in 2007, the pair had begun to receive requests for help from pharmaceutical giants such as Lundbeck and Eli Lilly as well as the world’s biggest scientific, technical and medical publisher, Elsevier.
“We didn’t immediately see a commercial application for our work,” says Tummarello, who, along with Delbru, launched SindiceTech in December last year amid requests for help from the companies mentioned. “We sort of had no choice but to start up,” laughs Tummarello.
Indeed, in a week when Talis, one of the leading names in semantic web and linked data technology, decided to discontinue its work in this area, claiming it doesn’t represent “an addressable market”, the pair are still a little taken aback at the rate of progress the company is making.
“What happened is that our efforts, and the capabilities of the very large-scale knowledge processing which we work on, has not gone unnoticed,” says the Italian-born Tummarello.
The companies saw the technologies created by the pair and their researchers as a scalable and stable warehousing option for their data, one which allowed them to share and repurpose this data on a large scale.
If the semantic web is the integration and exploitation of the enormous amount of information on the internet, the companies who contacted Tummarello and Delbru wanted to do the same within their own infrastructure.
“Take a company who may have acquired five other companies over a year and taken in all their data,” says Tummarello. “It makes a lot of sense to integrate the data from each into one because you would be able to create business or software or whatever type of solutions which are much more powerful. Traditionally, you would be designing a huge database that requires a lot of product design to create, all at immense cost.”
SindiceTech’s work cuts this out by forming “strategic linked data clouds” for companies, making masses of complex data easily manageable. A spin-off company from NUIG’s Deri Institute, a world leader in the area of the semantic web, SindiceTech arose from the work behind Sindice.com, a research project which focused on creating a next-generation search engine.
Sindice.com lets users collect, search and query semantically marked-up web data, and today there are 700 million web pages of such data indexed within the site’s search engine. While the last decade has seen immense development within semantic web tools, until this point it had been difficult to see where all that work could be monetised.
“Sometimes companies have all these databases and have this feeling that something useful can be done with them but they don’t know what that is . . . and even if they do look into it, it might take three months to decide what that is,” says Tummarello.
“However, we can take a new database from a company or a research group or whatever else, and integrate it in a way and start getting some benefit out of it right away. That’s what was not possible before.”
In terms of rivals in this area, the Italian points to the fact that the company’s clientele “could have gone anywhere for this, yet they found themselves attracted to a research initiative in the west of Ireland”, and says the company sees itself as unique in the big data marketplace.
While their customers may be global names, the company is still run on a small scale, something which suits those in its Galway offices. Each contract involves “intensive collaboration with the customer”, says Tummarello, adding, “it’s not a €300 desktop product we’re selling here”.
Alongside Tummarello and Delbru, there are four other full-time staff members along with a “loose collective of six or seven” others who, as the company expands, will hopefully be taken on full time, while “engineers with interest in big data technologies” are being sought as well.
In September, SindiceTech will look towards private investment for “a couple of million euro to support the operation properly” as they seek to open a US office, though Tummarello points out that with the company soon to become profitable, such investment may not be needed.
Elsewhere, the company just released the open-source product SparQLed, which it hopes will produce yet more developments within the linked data cloud space. “We have our own space really,” says Tummarello, explaining the move, “and it’s a great space to be in.”