Does anyone know how to get the number of revisions of a Wikipedia page using the MediaWiki API?
I have read this API documentation, but can't find the related API:
Revision API
The only possibility is to retrieve all revisions and count them. You might need to continue the query for that.
Bug 17993 is about including a count, but is still unsolved.
Here is code to get the number of revisions of a page (in this case, the JSON wiki page):

import requests

BASE_URL = "https://en.wikipedia.org/w/api.php"
TITLE = 'JSON'

parameters = {
    'action': 'query',
    'format': 'json',
    'continue': '',
    'titles': TITLE,
    'prop': 'revisions',
    'rvprop': 'ids|userid',
    'rvlimit': 'max',
}

total_revisions = 0
while True:
    wp_call = requests.get(BASE_URL, params=parameters)
    response = wp_call.json()

    # Count the revisions returned in this batch.
    for page in response['query']['pages'].values():
        total_revisions += len(page.get('revisions', []))

    # Keep requesting batches until the API stops returning 'continue'.
    if 'continue' in response:
        parameters.update(response['continue'])
    else:
        break

print(parameters['titles'], total_revisions)
You can check the result here: https://en.wikipedia.org/w/index.php?title=JSON&action=info#Edit_history
(accessible from the corresponding wikipedia page sidebar: Tools - Page information)
Retrieve the revisions and write a method to count them (the response is just XML).
api.php ? action=query & prop=revisions & titles=API|Main%20Page & rvprop=timestamp|user|comment|content
<api>
  <query>
    <pages>
      <page pageid="1191" ns="0" title="API">
        <revisions>
          <rev user="Harryboyles" timestamp="2006-10-31T05:39:01Z" comment="revert unexplained change: see talk ...">
            ...content...
          </rev>
        </revisions>
      </page>
      <page pageid="11105676" ns="0" title="Main Page">
        <revisions>
          <rev user="Ryan Postlethwaite" timestamp="2007-06-26T19:05:06Z" comment="rv - what was that for?">
            ...content...
          </rev>
        </revisions>
      </page>
    </pages>
  </query>
</api>
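As a rough sketch of the "count them" step, the XML can be parsed with Python's standard `xml.etree.ElementTree` module and the `<rev>` elements counted per page. The sample string is trimmed from the response above, and the function name `count_revisions_per_page` is only illustrative. Note that one response holds a single batch of revisions, so for a full count the per-batch numbers would still have to be summed across continued requests.

```python
import xml.etree.ElementTree as ET

# Sample response trimmed from the example above (comments and content elided).
SAMPLE_XML = """
<api>
  <query>
    <pages>
      <page pageid="1191" ns="0" title="API">
        <revisions>
          <rev user="Harryboyles" timestamp="2006-10-31T05:39:01Z">...content...</rev>
        </revisions>
      </page>
      <page pageid="11105676" ns="0" title="Main Page">
        <revisions>
          <rev user="Ryan Postlethwaite" timestamp="2007-06-26T19:05:06Z">...content...</rev>
        </revisions>
      </page>
    </pages>
  </query>
</api>
"""


def count_revisions_per_page(xml_text):
    """Return {page title: number of <rev> elements} for one API response."""
    root = ET.fromstring(xml_text.strip())
    counts = {}
    for page in root.iter("page"):
        # Pages without a <revisions> child simply count as 0.
        counts[page.get("title")] = len(page.findall("./revisions/rev"))
    return counts


print(count_revisions_per_page(SAMPLE_XML))
```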
With the REST API provided by newer MediaWiki versions, you can use the "Get page history counts" endpoint to get the number of revisions of a page.
For example,
GET https://en.wikipedia.org/w/rest.php/v1/page/Jupiter/history/counts/edits?from=384955912&to=406217369
This request will return a JSON response like the following:
{
  "count": 110,
  "limit": false
}
Zero coding at all.
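If you do want to call it from a script, here is a minimal Python sketch that builds the same URL and reads the response. The helper names `history_count_url` and `parse_count` are made up for illustration, and replacing spaces in the title with underscores is an assumption about how titles are normally passed. As far as I understand the documentation, `limit` tells you whether the count was capped at the API's upper bound, so check it before trusting the number on very large histories.

```python
import json
from urllib.parse import quote, urlencode

REST_BASE = "https://en.wikipedia.org/w/rest.php/v1"


def history_count_url(title, count_type="edits", from_rev=None, to_rev=None):
    """Build the history-count URL for a page (hypothetical helper)."""
    # Assumption: spaces become underscores, everything else is percent-encoded.
    path_title = quote(title.replace(" ", "_"), safe="")
    url = f"{REST_BASE}/page/{path_title}/history/counts/{count_type}"
    query = {}
    if from_rev is not None:
        query["from"] = from_rev
    if to_rev is not None:
        query["to"] = to_rev
    return f"{url}?{urlencode(query)}" if query else url


def parse_count(body):
    """Return (count, limit) from the JSON body of a response."""
    data = json.loads(body)
    return data["count"], data["limit"]


print(history_count_url("Jupiter", from_rev=384955912, to_rev=406217369))
print(parse_count('{"count": 110, "limit": false}'))
```

To actually fetch the count, pass the URL to any HTTP client, for example `requests.get(url).text`, and hand the body to `parse_count`. Leaving out `from_rev` and `to_rev` requests the count for the whole history.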
As stated in Wint's answer, the best solution is probably to use the REST API.
Though, if you have to use the usual Action API, your only solution is to count the revisions (which obviously will be slow on pages with large histories).
I have just written some JavaScript code for this:
/* jshint esversion: 6 */
/* globals Promise, mw */

function countRevisions( pageTitle ) {
    return new Promise( function ( resolve, reject ) {
        mw.loader.using( 'mediawiki.api', function () {
            const api = new mw.Api();
            const userGroups = mw.config.get( 'wgUserGroups' );
            const apiLimit = userGroups.includes( 'sysop' ) || userGroups.includes( 'bot' ) ? 5000 : 500;
            let count = 0;

            function makeRequest( apiContinue ) {
                const params = {
                    action: 'query',
                    prop: 'revisions',
                    titles: pageTitle,
                    rvprop: '', // we don't need any property
                    rvlimit: apiLimit,
                    formatversion: 2,
                };
                if ( apiContinue ) {
                    Object.assign( params, apiContinue );
                }
                api.get( params ).done( function ( data ) {
                    if ( !data.query ) {
                        reject();
                        return;
                    }
                    const revisions = data.query.pages[ 0 ].revisions;
                    if ( revisions ) {
                        count += revisions.length;
                    }
                    if ( data[ 'continue' ] ) {
                        makeRequest( data[ 'continue' ] );
                    } else {
                        resolve( count );
                    }
                } ).fail( function () {
                    reject();
                } );
            }

            makeRequest();
        } );
    } );
}

countRevisions( 'Page title' ).then( function ( count ) {
    /* ... */
} );