
Sarah Blow is a computer science graduate of Manchester University in England. Sarah is now working on bringing DataSift through its alpha testing phase in preparation for wider use. The filtering technology underpinning DataSift allows highly refined searches across a selection of social networking platforms.
Five years ago, in her spare time, Sarah founded Girl Geek Dinners to alleviate the isolation that many women may feel while working in what is still a male-dominated tech industry.
How did DataSift come to be?
“DataSift is pretty much the back end system that powers TweetMeme. We wanted to rebuild the engine because we knew we could do more with it. Rather than changing TweetMeme, we created this new brand called DataSift.”
Who is it for?
“The types of people we are bringing in are big financial services companies, anyone looking to do marketing and marketing analysis, agencies that are looking after brands and so on. Pretty much anything you can think of where you want to find deeper layers of information.
“While it is in alpha we are looking mainly for developers but with the mindset of bringing anyone in who is filtering and creating content. We want to find out what their needs are beyond the basics of what we’ve got there at the moment.
“We have basically asked, “What do people want to do with the system?” And now we are looking at ways of packaging it so it can work in the right ways for different groups of users.”
What do people use it for?
“We’ve seen people use it for geo-targeting and geo-mapping content in order to find out about particular brands and track them. Also, people use it to find out where their users are based. They can also find the most influential users within their particular market.”
- Use example: San Francisco 49ers
“For TechCrunch Disrupt we demonstrated the capability of DataSift by using publicly available information from a San Francisco 49ers game that was happening that weekend.
“There were three rules which we set up 24 hours in advance.”
- Data Collection: “We first created a base rule. That was pulling in information from everyone at Candlestick Park. Anyone who mentioned the name San Francisco 49ers and anyone who mentioned any of the players from the 49ers and the opposing team.
“There was no geo data in that filtered stream.”
- Geo-location: “From there we built a second rule on top. Taking all the output from the first rule we said, “Right, now we want anything in San Francisco.” So, if someone has set their Twitter location to San Francisco we could pick them up including their tweets about that particular subject but only that subject.”
- Geo-targeting: “Then we decided, that’s good, but it’s not perfect. What we really want to know is who is in the Park that’s really seeing the cool stuff. Can we manage to get some twitpics, for example, from inside the Park from people who were actually there?
“The only way you can verify they are really in the Park is if they are actually geo-located in the Park. So we have an option on DataSift to do a single point and set a radius around it. We found the geo-target for Candlestick Park and set a one kilometre radius around it which pretty much covers that area and anyone just outside the stadium.
“But it didn’t come back with much at all. It literally came back with one user and they hadn’t done any photographs. They had just tweeted that they were there.”
That only one user matched the parameters that were set up is very interesting in itself. It would have been reasonable to expect far more results. It is always worthwhile to remember that what one assumes about a situation and what really happens may be two entirely different things.
This is why having better tools to drill down into the data and to refine and define the results is so important.
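The three stacked rules described above can be sketched in a few lines of Python. This is not FSDL and not DataSift's API, only an illustration of the layering idea. The `Tweet` fields, the keyword list, the function names and the approximate stadium coordinates are all assumptions made for the example, as is the choice to run the radius rule on the output of the base rule rather than on the city rule.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional, Tuple

@dataclass
class Tweet:
    # Hypothetical record shape; real stream fields will differ.
    user: str
    text: str
    profile_location: str = ""
    coords: Optional[Tuple[float, float]] = None  # (lat, lon) if geo-tagged

# Assumed keyword list; player names would be added in the same way.
KEYWORDS = ("san francisco 49ers", "candlestick park")
# Approximate coordinates for Candlestick Park (an assumption for the sketch).
CANDLESTICK = (37.7136, -122.3861)

def haversine_km(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    """Great-circle distance between two (lat, lon) points in kilometres."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def base_rule(tweets: Iterable[Tweet]) -> Iterator[Tweet]:
    """Rule 1: keep tweets that mention any keyword (no geo data used)."""
    for t in tweets:
        text = t.text.lower()
        if any(k in text for k in KEYWORDS):
            yield t

def city_rule(tweets: Iterable[Tweet], city: str = "san francisco") -> Iterator[Tweet]:
    """Rule 2: keep tweets whose author's profile location names the city."""
    for t in tweets:
        if city in t.profile_location.lower():
            yield t

def radius_rule(tweets: Iterable[Tweet],
                centre: Tuple[float, float] = CANDLESTICK,
                radius_km: float = 1.0) -> Iterator[Tweet]:
    """Rule 3: keep geo-tagged tweets within radius_km of a single point."""
    for t in tweets:
        if t.coords is not None and haversine_km(t.coords, centre) <= radius_km:
            yield t

sample = [
    Tweet("a", "Go San Francisco 49ers!", "San Francisco, CA", (37.7140, -122.3865)),
    Tweet("b", "Watching the San Francisco 49ers on TV", "San Francisco"),
    Tweet("c", "San Francisco 49ers all the way", "London"),
    Tweet("d", "Lovely weather today", "San Francisco", (37.7136, -122.3861)),
]

matched = list(base_rule(sample))      # on-topic tweets
in_city = list(city_rule(matched))     # on-topic and profile set to the city
in_park = list(radius_rule(matched))   # on-topic and geo-tagged near the stadium

print([t.user for t in matched], [t.user for t in in_city], [t.user for t in in_park])
```

Because most tweets carry no coordinates at all, the radius rule is by far the strictest of the three, which is consistent with the single result reported in the interview.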
- Use example: Starbucks
“We were based in San Francisco at the time. So we tried a different exercise using DataSift where we basically said, “We want to find anyone in Starbucks who has got a PeerIndex score of over 40.” Let’s see who the influential people were in San Francisco at that time and find out which Starbucks they are in today. That was a fun one to do.
“You could do something similar with breaking news. If you knew there was a story breaking in a particular location, and you were a news organisation wanting to filter down to the legitimate sources who were really in that location, using DataSift would certainly be one way of doing it.”
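The Starbucks exercise combines a content match with an influence threshold. A minimal Python sketch of that combination follows. The `Post` record, the `influence` field (standing in for a PeerIndex-style score) and the function name are hypothetical, and the venue is matched on the post text only because the interview does not say how the venue was detected.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Post:
    # Hypothetical record; "influence" stands in for a PeerIndex-style score.
    user: str
    text: str
    influence: int

def influential_at(posts: List[Post], venue: str, min_score: int = 40) -> List[str]:
    """Return users scoring above min_score whose post mentions the venue,
    ordered from most to least influential."""
    hits = [p for p in posts
            if venue.lower() in p.text.lower() and p.influence > min_score]
    return [p.user for p in sorted(hits, key=lambda p: p.influence, reverse=True)]

posts = [
    Post("ana", "Working from Starbucks on Market St", 62),
    Post("ben", "Coffee at Starbucks", 35),
    Post("cy", "Lunch at the pier", 80),
    Post("dee", "starbucks queue is long today", 41),
]

print(influential_at(posts, "Starbucks"))
```

The same shape would serve the breaking-news case by swapping the venue match for a location test and the score threshold for whatever signal a newsroom trusts.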
What is the next step?
“We are aiming to have a drag and drop interface, which we haven’t finished yet. Users who don’t necessarily have strong technical ability or an understanding of the technicalities shouldn’t need that level of detail to use the system.
“We only really expect the FSDL language that we have got there to be used by developers. It is not really aimed at the general user. But while DataSift is in alpha we’ll teach the general users how to use it in case we take a bit longer doing the other side of it.”
The work being done by Sarah and the DataSift team is promising to be a cutting edge development in information retrieval. If you want to help with their alpha testing you can still sign up at DataSift.


The simple aim of this article is to make the ideas and technology around Linked Data more accessible. The idea of approaching the subject from an imaginary Q&A perspective enabled us to avoid the constant use of acronyms in the main body of the text. (There is a glossary with links at the end.)
Lin really saved my bacon the day we did this interview. It was four in the afternoon and all my scheduled interviews for the afternoon had fallen through. On the off-chance I gave her a call and put in a request for an interview and she said, “Sure.” The outcome: A really great interview that required almost no editing.
How can one explain the popularity of this post? “Vanity is All”
Blaine gave a terrific talk at
Probably our find of the year. As I passed by
This was probably the most fun assignment of the year. It was just wonderful to see the children’s response to the virtual world technology adapted and developed by Irish company 
The interview conducted by Telepresence with Carlos in New Jersey, USA and myself in Galway, Ireland will probably be as close as I’ll ever get to experiencing the
We, at Technology Voice, are big fans of
We would like to thank
Nobody really knows what the future will hold, but it is certain that the Arduino and the uses it will be put to will play an ever-increasing role. It allows those with the minimum of technical ability to make really cool ideas possible. It breaks down a very important barrier between having an idea and being able to express it as best as one can.

“Mashable is now in its sixth year so it is established. Maybe not as established when placed alongside the New York Times. However, relative to many other online publications Mashable is well entrenched and well known for what it has been able to do. 






“I am interested in the application of technology and in automation and I was really interested in taking technology into a space where it wasn’t being widely used or wasn’t being used fully.
“I thought it would be quite simple to implement. As I got into it I realized there were huge complexities in what we were trying to do. Because, really what I was trying to do was behavioural analysis on the different behaviour patterns that a person has in their home. What’s normal, what’s not normal and then create alerts on the basis of what’s abnormal. Sifting those patterns from noise is quite a complex task.
“Before you get a product on to the market there is a lot of research and development that has to happen to make sure it meets a particular standard. Unlike a regular consumer type product you can’t put something like this out if it doesn’t work properly. It’s not OK to say, “It’s just got a few bugs.” 

As 
“When you are looking for a lawyer or someone like that, a friend will often recommend one, and when you go to see them they can turn out not to be the sort of person you are looking for. We can circumvent that waste of time by letting you see all the details of the person who has the skill before you contact them.
“Even if you were looking for a corporate lawyer to float your company on the Nasdaq you’ll get a resume or some type of CV. What people are really looking for is what projects he or she has been involved in, what role he or she played, how long it took to do, and examples of the expertise he or she has.