Get extended/full text tweets in Twitter API v2

Asked 22/8, 2020 at 19:33 Answered 22/4 at 10:12

The new Twitter v2 API was just released a couple of weeks ago, so this may just be an issue of the documentation not being done quite yet.

What I am trying to do is search recent tweets for "puppies" and return all that have some kind of media attached. However, when I run this search in Postman, not all of the returned tweets have attachments.media_keys. I noticed that the ones that do not have attachments.media_keys are tweets whose text ends in ellipses .... I understand that in the v1.1 API, this issue is solved by specifying tweet_mode=extended in the query params or tweet.fields=extended_tweet. However, these do not seem to work in the v2 API and I have not seen any documentation about getting the full text of tweets (and the associated attachments). Does anyone know how to do this in v2?

My Postman query url: "https://api.twitter.com/2/tweets/search/recent?query=has:media puppies&tweet.fields=attachments&expansions=attachments.media_keys&media.fields=duration_ms,height,media_key,preview_image_url,public_metrics,type,url,width"

In my app, I am using Node.js Axios to perform the query:

var axios = require('axios');

var config = {
  method: 'get',
  url: 'https://api.twitter.com/2/tweets/search/recent?query=has:media puppies&tweet.fields=attachments&expansions=attachments.media_keys&media.fields=duration_ms,height,media_key,preview_image_url,public_metrics,type,url,width',
  headers: { 
    'Authorization': 'Bearer {{my bearer token}}',
  }
};

axios(config)
.then(function (response) {
  console.log(JSON.stringify(response.data));
})
.catch(function (error) {
  console.log(error);
});

Keek answered 22/8, 2020 at 19:33 Comment(0)

Great question, thank you. We’re discussing this on the Twitter Developer forums as well.

In v2 of the API we have eliminated the notion of an “extended Tweet” since we assume that all new apps understand the concept of 280 characters, so the complete text is in the Tweet text field.

The difference you’re finding is in retweets or quoted Tweets where the embedded text is truncated. This is (perhaps surprisingly) the same as v1.1 and the former premium and enterprise APIs as well. We are investigating whether to modify this, and the implications in doing so.

I don’t by any means want to take traffic away from Stack, but you might find more ongoing updates and information on our developer forums. Thanks!

Lactoprotein answered 22/8, 2020 at 21:27 Comment(1)

Awesome, so unless quoted / retweet / reply a bare tweet should be complete! – Antipasto 25/2, 2022 at 22:0

As of July 2021, this "problem" (or strange behavior) definitely affects retweets.

To get the full text of a retweet while fetching a user's recent tweets, I used the following trick:

First, I fetch the user's recent tweets, following the docs:

curl "https://api.twitter.com/2/users/2244994945/tweets?expansions=attachments.poll_ids,attachments.media_keys,author_id,entities.mentions.username,geo.place_id,in_reply_to_user_id,referenced_tweets.id,referenced_tweets.id.author_id&tweet.fields=attachments,author_id,context_annotations,conversation_id,created_at,entities,geo,id,in_reply_to_user_id,lang,possibly_sensitive,public_metrics,referenced_tweets,reply_settings,source,text,withheld&user.fields=created_at,description,entities,id,location,name,pinned_tweet_id,profile_image_url,protected,public_metrics,url,username,verified,withheld&place.fields=contained_within,country,country_code,full_name,geo,id,name,place_type&poll.fields=duration_minutes,end_datetime,id,options,voting_status&media.fields=duration_ms,height,media_key,preview_image_url,type,url,width,public_metrics,non_public_metrics,organic_metrics,promoted_metrics&max_results=5" -H "Authorization: Bearer $BEARER_TOKEN"

This query requests every field (not all of them are necessary), but what matters is that the returned JSON contains ['includes']['tweets']. That is where to look for the full text of a retweet: it is at ['includes']['tweets'][0..n]['text'], while all the recent tweets (and retweets) are at ['data'][0..n]['text'].
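Since not all of those fields are needed, a much shorter request may be enough. The sketch below builds one with only the `referenced_tweets.id` expansion and two tweet fields. The helper name `build_user_tweets_url` is made up for illustration, and the reduced parameter set is an assumption drawn from the description above, not something checked against every kind of account or tweet.

```python
from urllib.parse import urlencode

API_ROOT = "https://api.twitter.com/2"


def build_user_tweets_url(user_id, max_results=5):
    """Build a user-timeline URL that also asks for the referenced tweets.

    Hypothetical helper: the parameter set is an assumed minimum for getting
    ['includes']['tweets'] back, not an officially documented recipe.
    """
    params = {
        # Pulls the referenced (e.g. retweeted) tweets into includes.tweets
        "expansions": "referenced_tweets.id",
        # referenced_tweets carries the type and id needed for matching
        "tweet.fields": "referenced_tweets,text",
        "max_results": max_results,
    }
    # urlencode percent-encodes the values (the comma becomes %2C)
    return f"{API_ROOT}/users/{user_id}/tweets?{urlencode(params)}"


url = build_user_tweets_url("2244994945")
print(url)
```

The resulting URL can be passed to curl with the same Authorization header as above.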

Then you have to match the shortened retweets in ['data'] with the tweets in ['includes']['tweets']. I do it using ['data'][n]['referenced_tweets'][0]['id'], which should match ['includes']['tweets'][m]['id'], where n and m are some indexes.

To be 100% safe you can check that ['data'][n]['referenced_tweets'][0] has the key/value pair type: retweeted (indicating that this really is a retweet reference), but for me index 0 worked in all the cases I checked, so to avoid complicating things I left it this way for now :)

If that sounds complicated, just dump the whole parsed JSON with all tweets and check the structure of the data.
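The matching step can be sketched in a few lines of Python. This is only an illustration on a made-up payload shaped like the response described above: the function name `full_texts` and the sample data are assumptions, and it includes the stricter type check instead of relying on index 0.

```python
def full_texts(payload):
    """Return {tweet_id: text}, using the original tweet's text for retweets."""
    # Index the expanded tweets by id so each lookup is a dictionary access
    included = {t["id"]: t for t in payload.get("includes", {}).get("tweets", [])}

    result = {}
    for tweet in payload.get("data", []):
        text = tweet["text"]
        for ref in tweet.get("referenced_tweets", []):
            # Only swap in the text for genuine retweets, not quotes or replies
            if ref["type"] == "retweeted" and ref["id"] in included:
                text = included[ref["id"]]["text"]
                break
        result[tweet["id"]] = text
    return result


# Made-up sample payload, not real API output
sample = {
    "data": [
        {
            "id": "1",
            "text": "RT @someone: a long tweet that gets cut off…",
            "referenced_tweets": [{"type": "retweeted", "id": "10"}],
        },
        {"id": "2", "text": "an ordinary tweet"},
    ],
    "includes": {
        "tweets": [{"id": "10", "text": "a long tweet that gets cut off here, in full"}],
    },
}

print(full_texts(sample))
```

Building the id index once avoids rescanning ['includes']['tweets'] for every retweet.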

Andra answered 20/7, 2021 at 10:49 Comment(1)

Thanks for your answer, which is exactly what I'm doing right now. Another strange thing I came across is that some tweets, even in the "includes" section (the tweets that were retweeted), end in ellipses and are in fact not complete. Do you have any idea why that is happening? – Weaponeer 21/5, 2022 at 14:22

The accepted answer is indeed correct. I needed to implement this in Python, so I wrote some code that does exactly what @Picard said.

import requests
import urllib.parse
import json

keyword_to_search = 'BMW Cars'

safe_string = urllib.parse.quote_plus(keyword_to_search)

url = "https://api.twitter.com/2/tweets/search/recent?expansions=attachments.poll_ids,attachments.media_keys,author_id,entities.mentions.username,geo.place_id,in_reply_to_user_id,referenced_tweets.id,referenced_tweets.id.author_id&tweet.fields=attachments,author_id,context_annotations,conversation_id,created_at,entities,geo,id,in_reply_to_user_id,lang,possibly_sensitive,public_metrics,referenced_tweets,reply_settings,source,text,withheld&user.fields=created_at,description,entities,id,location,name,pinned_tweet_id,profile_image_url,protected,public_metrics,url,username,verified,withheld&place.fields=contained_within,country,country_code,full_name,geo,id,name,place_type&poll.fields=duration_minutes,end_datetime,id,options,voting_status&media.fields=duration_ms,height,media_key,preview_image_url,type,url,width,public_metrics,non_public_metrics,organic_metrics,promoted_metrics&max_results=10"

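# The recent-search endpoint documents `query` as a URL query parameter,
# so append the encoded keyword to the URL rather than sending it in a GET body
request_url = f'{url}&query={safe_string}'
headers = {
  'Authorization': 'Bearer TOKEN_YOU_GOT_FROM_TWITTER',
}

response = requests.get(request_url, headers=headers)
response.raise_for_status()  # stop early on HTTP errors such as 401 or 429
parsed_response = json.loads(response.text)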

# Index the expanded tweets by id so retweets can be resolved with one lookup
included_tweets = {
    item['id']: item
    for item in parsed_response.get('includes', {}).get('tweets', [])
}

outputs = []
for tweet in parsed_response.get('data', []):
    tweet_id = tweet['id']  # avoid shadowing the built-in id()
    text = tweet['text']

    # detect whether this is a retweet
    retweet_id = None
    for referenced_tweet in tweet.get('referenced_tweets', []):
        if referenced_tweet['type'] == 'retweeted':
            retweet_id = referenced_tweet['id']

    # a retweet's own text is truncated; use the original tweet's text instead
    if retweet_id in included_tweets:
        text = included_tweets[retweet_id]['text']

    outputs.append({
        'tweet_id': tweet_id,
        'tweet_text': text
    })

print(outputs)

Donetsk answered 22/4 at 10:12 Comment(0)