Investigating alternatives for retrieving the timeline of a Twitter account
Official libraries listed in the Twitter documentation:
- Hosebird Client, which consumes Twitter's Streaming API. It uses twitter4j internally to get the timeline.
- GitHub library that uses Advanced Search
  - PROS: No limit on the number of tweets
  - CONS: Not all the tweets are present. See documentation

- GitHub library that uses the timeline API: https://twitter.com/i/search/timeline?&q=from:LetGo&f=tweets
  - After testing it, it does not parse the timeline correctly
- Custom library that triggers the ScrollDown feature and scrapes the HTML elements
  - PROS: No limit on the number of requests
  - CONS: Limited to roughly 800 to 900 tweets
- twitter4j is an unofficial Java library for the Twitter API.
- For this project, I used the twitter4j library to retrieve the timeline of a given user.
Getting the timeline of a user has restrictions:
Response formats JSON
Requires authentication? Yes
Rate limited? Yes
Requests / 15-min window (user auth) 900
Requests / 15-min window (app auth) 1500
- Used auth key
If you want to use your own Twitter authentication keys, set OAuthConsumerSecret, OAuthConsumerKey, OAuthAccessToken and OAuthAccessTokenSecret as JVM parameters:
-DOAuthConsumerSecret=XXXXX -DOAuthConsumerKey=XXXXX -DOAuthAccessToken=XXXXX -DOAuthAccessTokenSecret=XXXXX
Or change twitter.properties.
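As a rough illustration of the two configuration routes above, the sketch below resolves each key from a JVM parameter first and falls back to values loaded from a properties file. The four key names come from this document; the class name `TwitterCredentials`, the `resolve` helper, and the lookup order are assumptions and may not match the project's actual code.

```java
import java.util.Properties;

public class TwitterCredentials {
    // Key names taken from the text above.
    static final String[] KEYS = {
        "OAuthConsumerKey", "OAuthConsumerSecret",
        "OAuthAccessToken", "OAuthAccessTokenSecret"
    };

    /**
     * Hypothetical helper: returns the -D value when present,
     * otherwise the value found in the properties file (may be null).
     */
    static String resolve(String key, Properties fileProps) {
        String fromJvm = System.getProperty(key);
        if (fromJvm != null && !fromJvm.isEmpty()) {
            return fromJvm;
        }
        return fileProps.getProperty(key);
    }

    public static void main(String[] args) {
        // Stand-in for values that would be loaded from twitter.properties.
        Properties fileProps = new Properties();
        fileProps.setProperty("OAuthConsumerKey", "XXXXX");

        for (String key : KEYS) {
            String value = resolve(key, fileProps);
            // Print only whether the key is set, never the secret itself.
            System.out.println(key + " set: " + (value != null));
        }
    }
}
```

Running with `-DOAuthConsumerKey=XXXXX` on the command line would make the JVM parameter take precedence over the file value in this sketch.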
- To reduce the number of requests to Twitter, Twitter provides a pagination feature.
  The maximum in paging is 1000.
  With twitter4j, however, the maximum is 200; the documentation of the deprecated method getUserTimeline states that this is for performance reasons.
- MainServerTest could use stricter validation of the output. A JSON Schema validator could have been used.
Don’t!
- Violate these or other policies. Be extra mindful of our rules about abuse and user privacy.
- Abuse the Twitter API or attempt to circumvent rate limits.
- Use non-API-based forms of automation, such as scripting the Twitter website. The use of these techniques may result in the permanent suspension of your account.
Restrictions in the timelines Twitter API
This method can only return up to 3,200 of a user’s most recent Tweets. Native retweets of other statuses by the user is included in this total, regardless of whether include_rts is set to false when requesting this resource.
Resource URL https://api.twitter.com/1.1/statuses/user_timeline.json
Resource Information
Response formats JSON
Requires authentication? Yes
Rate limited? Yes
Requests / 15-min window (user auth) 900
Requests / 15-min window (app auth) 1500
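To make the limits above concrete, here is a small arithmetic sketch. It combines the 3,200-Tweet cap and the 900 requests per 15-minute window (user auth) quoted above with the page size of 200 mentioned earlier for twitter4j. The class and method names are hypothetical.

```java
public class TimelinePaging {
    static final int MAX_TWEETS = 3200;      // cap on a user's most recent Tweets
    static final int PAGE_SIZE = 200;        // maximum page size noted for twitter4j
    static final int USER_AUTH_LIMIT = 900;  // requests per 15-min window (user auth)

    /** Number of paged requests needed to fetch the requested number of tweets. */
    static int requestsNeeded(int requestedTweets) {
        // Clamp to the range the API can actually serve.
        int capped = Math.min(Math.max(requestedTweets, 0), MAX_TWEETS);
        // Ceiling division: a partially filled page still costs one request.
        return (capped + PAGE_SIZE - 1) / PAGE_SIZE;
    }

    /** How many complete 3,200-tweet timelines fit into one rate-limit window. */
    static int fullTimelinesPerWindow() {
        return USER_AUTH_LIMIT / requestsNeeded(MAX_TWEETS);
    }

    public static void main(String[] args) {
        System.out.println("Requests for 3200 tweets: " + requestsNeeded(3200));
        System.out.println("Requests for 201 tweets: " + requestsNeeded(201));
        System.out.println("Full timelines per window: " + fullTimelinesPerWindow());
    }
}
```

Under these assumptions a full timeline costs 16 requests, so roughly 56 full timelines fit in one user-auth window.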
Because I did not provide a phone number:

- The main() method is in MainServer.java, which spawns an HTTP server using only classes from the JDK; the reason is to be able to tune a policy for request overflow.
- The operation that retrieves the tweets is idempotent, therefore GET is used.
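A minimal sketch of what a JDK-only server with an overflow policy might look like, assuming the `com.sun.net.httpserver` package shipped with the JDK. The `/tweets` path, the pool and queue sizes, the choice of `CallerRunsPolicy`, and all names in `MiniServer` are assumptions for illustration; this is not the code in MainServer.java.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class MiniServer {
    /** Only GET is accepted; anything else gets 405 Method Not Allowed. */
    static int statusFor(String method) {
        return "GET".equalsIgnoreCase(method) ? 200 : 405;
    }

    /** Bounded pool: when 100 requests are queued, the submitting thread runs the task itself. */
    static ThreadPoolExecutor boundedExecutor() {
        return new ThreadPoolExecutor(4, 4, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(100),
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    static void handle(HttpExchange exchange) throws IOException {
        int status = statusFor(exchange.getRequestMethod());
        // Placeholder bodies; a real handler would return the tweets as JSON.
        String text = status == 200 ? "[]" : "Method Not Allowed";
        byte[] body = text.getBytes(StandardCharsets.UTF_8);
        exchange.sendResponseHeaders(status, body.length);
        try (OutputStream out = exchange.getResponseBody()) {
            out.write(body);
        }
    }

    static HttpServer start(int port, ThreadPoolExecutor executor) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/tweets", MiniServer::handle); // hypothetical path
        server.setExecutor(executor);
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        ThreadPoolExecutor executor = boundedExecutor();
        HttpServer server = start(0, executor); // port 0 = any free port
        System.out.println("Listening on port " + server.getAddress().getPort());
        // A real server would keep running; this sketch shuts down immediately.
        server.stop(0);
        executor.shutdown();
    }
}
```

The bounded queue plus `CallerRunsPolicy` is one possible overflow policy: under load the accepting thread slows down instead of queuing requests without limit.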
- To create a new endpoint, add a new class that implements the Handler interface and add the logic that triggers it in FactoryHandler.
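The extension point described above could look roughly like the sketch below. Only the names Handler and FactoryHandler come from this document; the method signatures, the route paths, and the two example handlers are hypothetical, since the real interface is not shown here.

```java
import java.util.HashMap;
import java.util.Map;

public class HandlerSketch {
    /** Hypothetical shape of the Handler interface. */
    interface Handler {
        String handle(Map<String, String> params);
    }

    /** Example endpoint: echoes which timeline was requested. */
    static class TimelineHandler implements Handler {
        @Override
        public String handle(Map<String, String> params) {
            return "timeline for " + params.get("username");
        }
    }

    /** A new endpoint is just another Handler implementation. */
    static class PingHandler implements Handler {
        @Override
        public String handle(Map<String, String> params) {
            return "pong";
        }
    }

    /** Hypothetical factory: maps a request path to the handler that serves it. */
    static class FactoryHandler {
        private final Map<String, Handler> routes = new HashMap<>();

        FactoryHandler() {
            routes.put("/timeline", new TimelineHandler());
            routes.put("/ping", new PingHandler()); // registering the new endpoint
        }

        /** Returns the handler for the path, or null when no endpoint matches. */
        Handler forPath(String path) {
            return routes.get(path);
        }
    }

    public static void main(String[] args) {
        FactoryHandler factory = new FactoryHandler();
        Map<String, String> params = new HashMap<>();
        params.put("username", "LetGo");
        System.out.println(factory.forPath("/timeline").handle(params));
        System.out.println(factory.forPath("/ping").handle(params));
    }
}
```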
- Tweets can be locale-sensitive; the Locale class is used to identify the language in order to represent it.
- Since no information is stored, there is no need to monitor memory; if a cache is implemented, however, an overflow can lead to memory leaks.
- Exception handling: getUserTimeline throws the checked exception TwitterException, which is handled in MainServer#initizalizeContext(). It could also throw an unchecked exception, which is why } catch (Exception e) { is used inside this method.
- UTF-8 is used to decode characters; emoticons are represented, but Japanese kanji are not.
- If the request is not a GET with the correct parameters, the reason is not shown to the client.
- System.out.println is used to report errors such as max memory reached or a caught unchecked exception; a later step should replace it with a logging mechanism.
- The code was implemented using TDD. It was first divided into 2 modules, one creating the Server and one creating the Request Handler, which were then joined using integration tests.
- Unit tests can run without Internet connectivity; regression tests need it.
- The test that uses the unofficial, unsupported Advanced Search library fails erratically. See test AdvancedSearchAPITest.java
  searchNumberTweetsByUser_3201tweets()
  org.json.JSONException: JSONObject["min_position"] not a string.
    at org.json.JSONObject.getString(JSONObject.java:725)
    at me.jhenrique.manager.TweetManager.getTweets(TweetManager.java:81)
    at twitter.AdvancedSearchAPI.searchNumberTweetsByUser(AdvancedSearchAPI.java:39)
    at integration.twitter.AdvancedSearchAPITest.searchNumberTweetsByUser_3199tweets(AdvancedSearchAPITest.java:65)
  java.lang.AssertionError:
  Expected size:<3199> but was:<157> in: ...
- Load testing with JMeter, included in the project, iterates a simple request with the same username and tweet_number.
Use cases in which the cache is invalidated:
- A user is disabled/blocked/removed
- A new Tweet is created
- A user removes a tweet
Possible ways to solve it:
- Invalid user: use the Twitter API to look up the username; if it does not exist, evict the cache entry and return an empty response.
- New tweets created: use the implemented library that scrapes via ScrollDown, which has no request limit, to validate that the first 5 tweets are at the top of the cache:
  - If not included: use getUserTimeline and append the tweets to the cache
  - If included: return the latest num_tweets in the cache
- Tweet removed: use the Account Activity Twitter API to subscribe to a user's activity. The service that provides the activity of any user is an Enterprise service that requires a paid subscription. If we have access to an Enterprise Subscription API:
  - If the user is not in the cache: add the user to the cache, store the latest tweets retrieved from the getUserTimeline Twitter API together with the time of the latest activity, then subscribe to the user's activity
  - If the user is in the cache:
    - Check the activity from the subscription of the given user.
    - If user has deleted tweet from the last
    - Update time of the latest activity
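The "new tweets" check described above can be sketched as follows, assuming tweets are identified by numeric ids and both lists are ordered newest first. The class `TimelineCacheCheck`, its method names, and the use of ids rather than full tweet objects are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TimelineCacheCheck {
    static final int PROBE_SIZE = 5; // "first 5 tweets" from the text above

    /** True when the newest scraped ids are exactly the head of the cached list. */
    static boolean isFresh(List<Long> cachedIds, List<Long> newestScrapedIds) {
        int n = Math.min(PROBE_SIZE, newestScrapedIds.size());
        if (cachedIds.size() < n) {
            return false; // cache too short to contain the probe
        }
        return cachedIds.subList(0, n).equals(newestScrapedIds.subList(0, n));
    }

    /** Returns a copy of the latest numTweets ids held in the cache. */
    static List<Long> latest(List<Long> cachedIds, int numTweets) {
        int n = Math.min(Math.max(numTweets, 0), cachedIds.size());
        return new ArrayList<>(cachedIds.subList(0, n));
    }

    public static void main(String[] args) {
        List<Long> cache = Arrays.asList(5L, 4L, 3L, 2L, 1L, 0L);
        List<Long> scraped = Arrays.asList(6L, 5L, 4L, 3L, 2L);
        if (isFresh(cache, scraped)) {
            System.out.println("Serve from cache: " + latest(cache, 3));
        } else {
            // Here the real service would call getUserTimeline and refresh the cache.
            System.out.println("Cache is stale, refresh needed");
        }
    }
}
```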
A free alternative is to use the Twitter Stream to retrieve deleted tweets (see the Stack Overflow answer), but the Streaming Twitter API is deprecated and its documentation is not available.
- Use the Twitter API to obtain the API rate limit status before querying the timeline API
- Use a proper logging mechanism to triage errors
- The client can throttle its requests if the server adds more information when a request does not succeed:
  - Quota limit reached
  - HTTP request parameters not correct
  - User does not exist
- Java version: 8
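The three failure reasons listed above could be surfaced to the client along these lines. The reasons come from this document; the HTTP status codes chosen for them (429, 400, 404), the JSON shape, and all names in `ErrorResponses` are assumptions, not the project's current behaviour.

```java
public class ErrorResponses {
    /** Failure reasons from the list above, paired with assumed HTTP status codes. */
    enum Failure {
        QUOTA_LIMIT_REACHED(429, "Quota limit reached"),
        BAD_PARAMETERS(400, "HTTP request parameters not correct"),
        USER_NOT_FOUND(404, "User does not exist");

        final int status;
        final String reason;

        Failure(int status, String reason) {
            this.status = status;
            this.reason = reason;
        }
    }

    /** Builds a small JSON body; the reasons contain no characters that need escaping. */
    static String toJson(Failure failure) {
        return "{\"status\":" + failure.status + ",\"reason\":\"" + failure.reason + "\"}";
    }

    /** Client side: only a quota error is worth backing off and retrying. */
    static boolean shouldBackOff(int status) {
        return status == Failure.QUOTA_LIMIT_REACHED.status;
    }

    public static void main(String[] args) {
        for (Failure failure : Failure.values()) {
            System.out.println(toJson(failure) + " backOff=" + shouldBackOff(failure.status));
        }
    }
}
```

With a distinct status per reason, a client can slow down on a quota error and stop retrying on the other two.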