Geocode search results fall out of specified radius

kurrik
@kurrik Arne Roomann-Kurrik

Splitting https://dev.twitter.com/discussions/3360 into two threads:

From @JaimeCard:
"It also appears that the radius parameter isn't working either, I limited the radius to 1km of my location and the results returned included loads of tweets from outside the requested area. Check out this http://search.twitter.com/search.json?q=&geocode=52.814225,0.695762,1km&rpp=100"

From @AppQuickie:
"It is still very much broken, and may even have become worse:
Here is a sample query: http://snap.apigee.com/udmifT
It's for a geolocation in India: 12.9005,77.5942,1.0km
An example in the result: from_user_id 22143739 at 12.936665,77.605367
which you can see in this map is well outside the 1km radius:
http://maps.google.co.in/maps?saddr=12.9005,77.5942&daddr=%2B12%C2%B0+56'+11.71%22,+%2B77%C2%B0+36'+21.67%22&hl=en&ie=UTF8&ll=12.91991,77.600384&spn=0.050111,0.077162&sll=12.919158,77.600727&sspn=0.050111,0.077162&geocode=FZTYxAAdWP6fBA%3BFYplxQAdgyygBA&vpsrc=0&gl=in&mra=ltm&t=m&z=14
If you look at the results, you will find that most of the tweets are from America
from_user_id 298704650
from_user_id 270621096
from_user_id 168727404
from_user_id 365517170
from_user_id 261412592
just to point out a few
The result has tweets from Spain, Australia, Russia just to give you an idea, that this is still very much broken! Please FIX"

From @Openbronnen:
"The search in the center of The Hague (http://search.twitter.com/search.json?callback=?&result_type=recent&page=1&rpp=20&q=*&geocode=52.0781563203831,4.313850699999989,0.250km) returns tweets from all over The Hague but not one in the circle. The documentation of GET geo/search was updated on Fri, 2011-11-04....?"

From @clayosborne:
"http://search.twitter.com/search.json?q=&geocode=42.963866000000003,-85.529331999999997,1.50mi&rpp=100&result_type=recent
zero inside the circle.
http://search.twitter.com/search.json?q=&geocode=42.994352338967111,-85.628814697265625,1.50mi&rpp=100&result_type=recent
zero.
http://search.twitter.com/search.json?q=&geocode=41.822841421701440,-87.733722945601784,1.40mi&rpp=100&result_type=recent
zero
A full list of 100 tweets is returned in all cases, none of which actually fall inside the radius. The radius is surrounded by "nearby", irrelevant results."
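
Claims like the ones above can be checked numerically rather than on a map. The following is a minimal sketch, assuming a spherical Earth with a mean radius of 6371 km; the function name `haversine_km` is hypothetical and not part of any Twitter API. It measures the great-circle distance between the search centre and the tweet location quoted by @AppQuickie.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/long points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Search centre and the tweet location quoted by @AppQuickie.
distance = haversine_km(12.9005, 77.5942, 12.936665, 77.605367)
print(f"{distance:.2f} km from the search centre")
```

Under these assumptions the quoted tweet comes out at roughly 4 km from the centre, several times the requested 1 km radius.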

2 years 20 weeks ago

Replies

Openbronnen
@Openbronnen Openbronnen

See this example: http://code.google.com/intl/nl-NL/apis/maps/articles/mvcfun/twittersearch.html
Set the radius on 500 meters and hit search. Zoom to see the tweets outside the circle. The circle is empty!

2 years 20 weeks ago
AppQuickie
@AppQuickie Quickie

Any updates here? 5 weeks already for this issue. Is Twitter fixing anything?

https://dev.twitter.com/issues/progress , there are 7 issues in progress, almost all haven't been updated in more than a month

https://dev.twitter.com/issues/progress , 24 acknowledged issues with ETA: unknown

Are you out of developers? We work in application development as well and have never come across anything like it, where bugs take this long to resolve. It's unbelievable! Are you guys using punch cards for programming? Or using x286 machines? There must be a reason for this bottleneck, and we would really appreciate it if you disclosed why something that was working until 31st of October 2011 stopped working on the 1st of November 2011 and, on 8th December 2011, is still not working, with ETA: unknown. This is very hard to comprehend. Why don't you just tell us to go screw ourselves and shut down the API completely? At least then there would be closure.

2 years 19 weeks ago
episod
@episod Taylor Singletary

Not all issues are created equal. Not all resolved issues ever surface to the issue tracker. These issues affect us just as they affect you. I'm sorry that the issue you care most about is taking longer than other issues to completely resolve. We have many developers but they aren't all working on the special union of Search and Geo.

The why that something breaks: because we're continually re-engineering the back-ends of our product from being a monolithic whole into a composition of several pieces that inter-rely on each other to formulate responses. This affords us many benefits, but while individual components are introduced into play, it takes time to find the balance in their interactions.

Not all issues are "bugs." Some things we're not working on in a direct sense -- some are issues with the API that result from the organic growth/development of the API and the difficulty in making changes when there are more than 1 million applications now utilizing it. The API is merely an interaction framework for underlying systems that have to handle an amazing scale of traffic and over 250 million incoming tweets per day.

2 years 19 weeks ago
SkillsyOz
@SkillsyOz SkillsyOz

Sadly, I am experiencing the same issue on our bushfire map of Australia. This is an example of our search URL:
http://search.twitter.com/search.json?q=&geocode=-34.998,142.247,10.0km ...

This mainly returns tweets from the US, and sometimes from Indonesia or Malaysia. We use this to see tweets occurring around a bushfire, and it is just not returning what we require. We find more errors where tweet volume is lower; however, this may be random. With the season upon us, can I ask/plead that this issue be rectified, please?

2 years 13 weeks ago
SkillsyOz
@SkillsyOz SkillsyOz

Over the past two or three days, there has been a marked improvement in the results being returned and most appear to be within the confines requested. Occasionally the results are still being returned from well outside the area and from different countries but we appear to be heading down the right track. Thanks to the devs in almost fixing the issue.

2 years 10 weeks ago
clayosborn
@clayosborn Clayton Osborn

I have seen no change whatsoever, and I monitor this constantly, conducting searches all over the globe. Out of 100 results in a returned dataset, most of the time 0 are within the radius, and sometimes 3 or 4 will fall inside "only" because they were just posted a few minutes ago (results are sorted newest to oldest, which is the only reason they're at the top). They will quickly fall off the radar as more recent posts in the general area take over the 1 through 100 spots returned by your search results. If you're doing multiple requests of more than the first page, you'll see them stay in your results a little longer, but they will still eventually fall off pretty quickly.

If you're seeing more fall inside the radius lately, it's only an illusion perpetuated by a sudden increase in the tweet volume taking place within your search radius.

If people tweet within your radius often enough, those new tweets will top the most recent posts returned within your request's 100 results. I have rarely seen this as well, where one or two users inside the radius I'm checking will send out a bunch of tweets within a few minutes, putting a bunch of results inside the radius for a short amount of time. But that's all it is: a battle for the most recently tweeted content, which has very little to do with geolocation inside the radius.

If you're in a fairly unpopulated area, you'll see the tweets inside your radius hang out a little longer, simply because there aren't enough tweets happening in the general area to knock them down the most recent list as fast.

2 years 10 weeks ago
kurrik
@kurrik Arne Roomann-Kurrik

Yes, I still certainly see queries which return results out of the search radius. For the most part they seem to be a relatively small order of magnitude outside, so an approach to check may be to decrease your search radius by half, for example, to see whether you get more accurate results.

There's been some internal movement on the bug but I don't believe that anything has been pushed to production to address this yet, sorry.

2 years 10 weeks ago
AppQuickie
@AppQuickie Quickie

bleh, in 2 weeks it will be 4 months since this issue was acknowledged and still no answer in sight. Way to go Twitter! It was a big mistake to invest time and money in an app based on Twitter.

2 years 9 weeks ago
kurrik
@kurrik Arne Roomann-Kurrik

Have you tried the workaround I suggested? Is it at all effective?

2 years 9 weeks ago
clayosborn
@clayosborn Clayton Osborn

The only way to get more results inside the radius is to "increase" the size of the radius, not halve it. And doing so only creates an illusion that the search API just worked.

Radius is completely ignored. No amount of changing the size of it will affect the 100 tweets that will be returned. You will still receive the top 100, most recently posted, tweets from the placeId your lat/long is inside of.

Try this query:
http://search.twitter.com/search.json?q=&geocode=42.963866000000003,-85.529331999999997,1.00mi&rpp=100&result_type=recent

If the API were working you'd see all 100 results hit within that 1 mile search, all tweets originating from me and my co-workers, as we have tweeted far more than 100 times over that exact spot for the last few months. You probably won't see a single one though, instead getting 100 tweets from other areas all over the city of Grand Rapids. If you see any tweets at all, look at the time they were posted; it will be a few minutes ago. Check back later and they too will be gone with the rest of them, never to be found again by a geocode search, because there are 100+ newer posts in the general area outside the radius that will be returned first.

If I know of 80 tweets that do indeed exist on a certain location, doing a geocode search of that location should return exactly those 80 tweets and only those 80 tweets (barring drop off of ones older than a few weeks, or whatever that cut-off is set at for Search results).

100 results outside the radius, the API is not working.
5 of my tweets inside the radius and 95 others outside the radius, the API is not working.
Hell, even all 80 tweets showing up in the results and 20 others outside the radius showing up still would be a bug... but I would gladly accept those results since I can always filter out the unwanted 20.

Using my location, Grand Rapids, MI, as an example....a geocode, lat/long, search, currently does nothing more than the following:
* Your lat/long is near Grand Rapids.
* Return all the tweets from Grand Rapids, sorted by most recently posted.

That's it. Posts falling inside the radius are purely coincidental. I'll get my latest tweet, not because it's inside the radius, but only because it is the most recent tweet in all of Grand Rapids.

I don't know if any of this helps or not. I've repeated a lot of this in a few discussions, but have only ever seen responses that look to me like results of the illusion I alluded to at the beginning of this little book of a post I just wrote.
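
The client-side filtering mentioned above ("I can always filter out the unwanted 20") can be sketched as follows. This is an illustration only: it assumes each result carries a `geo` object whose `coordinates` list is `[latitude, longitude]`, and the helper names `haversine_km` and `split_by_radius` are hypothetical. Tweets without coordinates are counted as outside the radius here.

```python
import math

MILES_TO_KM = 1.609344


def haversine_km(lat1, lon1, lat2, lon2, earth_radius_km=6371.0):
    """Great-circle distance in kilometres (spherical Earth approximation)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def split_by_radius(results, center_lat, center_lon, radius_km):
    """Return (inside, outside); results without coordinates count as outside."""
    inside, outside = [], []
    for tweet in results:
        # Assumed result shape: {"geo": {"coordinates": [lat, lon]}} or "geo": None.
        coords = (tweet.get("geo") or {}).get("coordinates")
        if coords and haversine_km(center_lat, center_lon,
                                   coords[0], coords[1]) <= radius_km:
            inside.append(tweet)
        else:
            outside.append(tweet)
    return inside, outside


# Made-up sample results around the 1 mile Grand Rapids query from this post.
sample = [
    {"id": 1, "geo": {"type": "Point", "coordinates": [42.963866, -85.529332]}},
    {"id": 2, "geo": {"type": "Point", "coordinates": [42.994352, -85.628815]}},
    {"id": 3, "geo": None},
]
inside, outside = split_by_radius(sample, 42.963866, -85.529332, 1.0 * MILES_TO_KM)
print(f"{len(inside)} of {len(sample)} results fall inside the radius")
```

Filtering like this only discards what the API already returned; it cannot recover in-radius tweets that were pushed out of the 100 results by newer posts elsewhere.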

2 years 9 weeks ago
AppQuickie
@AppQuickie Quickie

We work with very small radii, up to a mile, and even with a reduced radius the results are incorrect. A fix would be much appreciated rather than ineffective workarounds.

2 years 9 weeks ago
TheoJohnson1
@TheoJohnson1 Theo Johnson

Get the same results by decreasing the radius. The same most recent posts assigned to the nearest 'place' to my search location.

2 years 9 weeks ago
kurrik
@kurrik Arne Roomann-Kurrik

Is "place" a guess, or have you verified this with code? An executable test case would be helpful.

2 years 8 weeks ago
TheoJohnson1
@TheoJohnson1 Theo Johnson

"place" is a theory based on observed results. I don't work for Twitter, so of course there is going to be no code to verify this with, other than my app having to parse through the erroneous and irrelevant results returned by the search API (i.e. throwing 95% - 100% of all results ever received away because they have nothing to do with the search I just conducted).

If you click on any location-based tweet in a feed on the Twitter website you'll see a mini map with your location and a highlighted red area. This area, I believe, is what your API refers to as a "place". The REST API has functions to get info on places as well, hence my use of the term "place". Every geocode query since the bug started seems to return results from within the whole of these areas, instead of within your designated radius.

For an executable test case, see any and every single link anyone has posted as proof to you thus far. They all demonstrate the problem, and I'm not sure how the problem can be explained any clearer, and why we're being asked for more test cases when so many perfectly good ones already exist. This is just sounding more and more like stalling tactics rather than actual troubleshooting.

2 years 8 weeks ago
pnewcombeapp
@pnewcombeapp Peter

I have not noticed any changes over the past few months in my geocode-based search scenarios. Results from searches seem to often be approximated to 'places' that intersect the search radius rather than the exact dimensions of the radius as was the case previously.

2 years 9 weeks ago
AppQuickie
@AppQuickie Quickie

Is this a miracle? My results seem to have tweets just from the specified radius, not outside! Can anyone pinch me by validating this?

2 years 8 weeks ago
clayosborn
@clayosborn Clayton Osborn

Yes, the API appears to have started working normally as of last night. Issue #98 for this is still marked as acknowledged, and was never updated to a status of "In Progress" at any time throughout this entire ordeal, however, so I'm hoping that this was a purposeful fix to the functionality.

2 years 8 weeks ago
episod
@episod Taylor Singletary

Over in the "Geocode search volume lower than expected" thread, I asked some questions last night after we made some fixes, to clarify whether this resolved things -- I prefer not to update the ticket until I have a good sense of the outcome. Looks like the fixes have made a big difference.

2 years 8 weeks ago
clayosborn
@clayosborn Clayton Osborn

Good to hear. Wish I would have subscribed to that thread as well. Thank you for the fix, everything is looking good again.

2 years 8 weeks ago
AppQuickie
@AppQuickie Quickie

Okay, my excitement was short-lived, doesn't seem like the issue is completely resolved. Now it has become intermittent.

I am using this query

http://search.twitter.com/search.json?geocode=12.902714448185623,77.60616040467157,1.0km&since_id=173075166648205313&result_type=recent&page=1&rpp=100

At times I get over 90 results
https://snap.apigee.com/yBzLCs

Other times I get no result
https://snap.apigee.com/yKu6F6

I have created a new issue (pending approval)
https://dev.twitter.com/discussions/6040

2 years 8 weeks ago
episod
@episod Taylor Singletary

There were likely some geo caching issues still not fully purged due to not yet reaching a TTL -- now that some more time has passed, are you still seeing this issue persist?

2 years 8 weeks ago
episod
@episod Taylor Singletary

Revising this, we now believe that the occasional nature of this is more indicative of a time out due to the complexity of servicing the query. We should be returning you a 502 in this case instead of an empty set (and the 502 would be considered normal operation). I'll begin the process here of having this condition map to a 502 as is more appropriate. Thanks.
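
On the client side, a timeout of this kind could be handled by retrying. The sketch below is only an illustration under stated assumptions: `search_with_retry` and the `fetch` callable (returning a `(status, results)` pair) are hypothetical, and treating an empty result set as retryable mirrors the behaviour described above, even though an empty set can also be a legitimate answer.

```python
import time


def search_with_retry(fetch, url, attempts=3, delay_s=1.0, sleep=time.sleep):
    """Call fetch(url) -> (status, results); retry on a 502 or an empty set.

    Returns the first non-empty result list, or [] once attempts run out.
    """
    for attempt in range(1, attempts + 1):
        status, results = fetch(url)
        if status == 200 and results:
            return results
        if attempt < attempts:
            sleep(delay_s * attempt)  # simple linear back-off between tries
    return []


# Stand-in for a real HTTP call: a 502, then an empty set, then real results.
responses = iter([(502, []), (200, []), (200, [{"id": 1}])])
calls = []


def fake_fetch(url):
    calls.append(url)
    return next(responses)


results = search_with_retry(
    fake_fetch,
    "http://search.twitter.com/search.json?geocode=12.902714448185623,77.60616040467157,1.0km",
    sleep=lambda _seconds: None,  # skip real waiting in this demonstration
)
print(len(calls), "calls made,", len(results), "result(s) returned")
```

Passing the fetch function in keeps the retry logic independent of any particular HTTP library.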

2 years 8 weeks ago
AppQuickie
@AppQuickie Quickie

Hi, it is still doing this, I'm sure it doesn't take 4 days to deploy in all server instances

2 years 7 weeks ago
mikeflores2000
@mikeflores2000 Mike Flores

I appreciate the passion shown on this issue. I found something of interest in the documentation:

https://dev.twitter.com/search/apachesolr_search/Oddities

1 year 49 weeks ago
ClaytonOsborn
@ClaytonOsborn Clayton Osborn

Same issue is back. This was broken for months, had been fixed and working for a couple months, and is now happening again (hopefully not for months again this time).

1 year 49 weeks ago
mikeflores2000
@mikeflores2000 Mike Flores

Found this link in the documentation:
https://dev.twitter.com/search/apachesolr_search/Oddities

Oddities

The Search API has been built up over time and some of its original behaviors are maintained. These behaviors can lead to confusion if you don't expect them.

Location data is only included if the query includes the geocode parameter, and the user Tweeted with Geo information. When conducting Geo searches, the Search API will:

* Attempt to find Tweets which have a place or lat/long within the queried geocode.
* Attempt to find Tweets created by users whose profile location can be reverse geocoded into a lat/long within the queried geocode.
This means it is possible to receive Tweets which do not include a latitude or longitude.
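
Because results can arrive without coordinates, a client may want to separate the two kinds before doing any distance check. A minimal sketch, assuming each result has a `geo` field that is either `None` or an object with a two-element `coordinates` list; the function names are hypothetical:

```python
def has_coordinates(tweet):
    """True if the result carries an explicit two-element coordinates list."""
    geo = tweet.get("geo")
    return bool(geo) and len(geo.get("coordinates") or []) == 2


def partition_by_geo(results):
    """Split results into (located, unlocated) lists, preserving order."""
    located = [t for t in results if has_coordinates(t)]
    unlocated = [t for t in results if not has_coordinates(t)]
    return located, unlocated


# Made-up sample: one geotagged result, two presumably matched another way.
sample = [
    {"id": 10, "geo": {"type": "Point", "coordinates": [52.0781, 4.3138]}},
    {"id": 11, "geo": None},
    {"id": 12},
]
located, unlocated = partition_by_geo(sample)
print(f"{len(located)} with coordinates, {len(unlocated)} without")
```

Results in the second list cannot be verified against the radius at all, since they were presumably matched through place or profile location.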

If a Tweet contains a URL, the Search API will match to the fully expanded version of the URL. This means a keyword search may return results which appear to not include the keywords you provide. Instead, the keyword is in the expanded version of the URL in the Tweet.

The user_ids in the Search API responses do not match those for Twitter.com by default. Use the with_twitter_user_id=true parameter to return IDs which match Twitter.com.
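
For illustration, a query string combining the geocode parameter with with_twitter_user_id=true could be assembled like this. The helper `build_search_url` is hypothetical, and the base URL is simply the endpoint quoted elsewhere in this thread:

```python
from urllib.parse import urlencode

BASE_URL = "http://search.twitter.com/search.json"  # endpoint as quoted in this thread


def build_search_url(lat, lon, radius, unit="km", **extra):
    """Build a geocode search URL that also asks for Twitter.com user IDs."""
    params = {
        "q": "",
        "geocode": f"{lat},{lon},{radius}{unit}",
        "with_twitter_user_id": "true",
    }
    params.update(extra)
    # safe="," keeps the commas in the geocode value unescaped, matching
    # the URLs posted above.
    return BASE_URL + "?" + urlencode(params, safe=",")


url = build_search_url(52.814225, 0.695762, 1, unit="km", rpp=100)
print(url)
```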

1 year 49 weeks ago