What is the best way of detecting that a Delphi TWebBrowser web page has changed since I last displayed it?

I want to display a 'news' page in a form using Delphi TWebBrowser. The news page is a simple HTML page which we upload to our website from time to time and which may be output from various tools. The display is fine, but I'd like my app to know whether the page has changed since I last displayed it, so ideally I'd like to get either its modified date/time or its size / checksum. Precision is not important, and the check should not rely on properties that might fail because 'simple' tools such as Notepad were used to edit the HTML file.

Checking on the web, there are several document modified java calls but I really don't know where to start with those. I've looked through the numerous calls in Delphi's Winapi.WinInet unit and I see I can fetch the file with HTTP to examine it, but that seems like cracking a walnut with a sledgehammer. I also cannot see any file date/time functionality, which makes me think I'm missing something obvious. I'm using Delphi XE5. In which direction should I be looking, please? Thanks for any pointers.

Ruddy answered 11/8, 2014 at 9:38 Comment(5)
Do you mean Java or JavaScript? Does the specific site you are interested in (which is known to you but not us) have any mechanism for live updating? - Nondescript
Depending on the server/site, an HTTP HEAD request could give you back the Last-Modified / Content-Length headers. - Giliane
@David: I'm on weak ground with this site stuff but we keep it very simple: uploading changed HTML pages as required, so nothing clever. - Ruddy
OK, so you are in control of both sides of this. You ought to state that in the question. - Nondescript
@David: Thanks, so noted. I've also specified that I'd like to tolerate any kind of editing that may have been done on the file, e.g. Notepad. - Ruddy
R
1

Thanks to a mixture of suggestions and pointers from kobik, David and TLama, I realised that I actually did need a sledgehammer, and I finally came up with this solution (and I'm probably not the first, or the last!). I had to read the file contents because that seemed a more reliable way of detecting changes. The code below calls "CheckForWebNewsOnTimer" infrequently from a TTimer and uses Indy to read the news page, make an MD5 hash of its contents and compare that with a previous hash stored in the registry. If the contents change, or 120 days elapse, the page pops up. The code has wrinkles: for example, a change to a linked image on the page will not by itself trigger a change, because only the HTML is hashed. But hey, it's only news, and the text almost always changes too.

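// Units needed: System.Classes, IdHTTP, IdHashMessageDigest

function StreamToMD5HashHex( AStream : TStream ) : string;
// Creates an MD5 hash hex of this stream, starting at its current position
var
  idmd5 : TIdHashMessageDigest5;
begin
  idmd5 := TIdHashMessageDigest5.Create;
  try
    result := idmd5.HashStreamAsHex( AStream );
  finally
    idmd5.Free;
  end;
end;

function HTTPToMD5HashHex( const AURL : string ) : string;
// Downloads the page at AURL and returns the MD5 hash hex of its content
var
  HTTP : TidHTTP;
  ST : TMemoryStream;
begin
  HTTP := TidHTTP.Create( nil );
  try
    ST := TMemoryStream.Create;
    try
      HTTP.Get( AURL, ST );
      // Get leaves the stream positioned at its end; rewind so that
      // the whole downloaded body is hashed rather than nothing.
      ST.Position := 0;
      Result := StreamToMD5HashHex( ST );
    finally
      ST.Free;
    end;
  finally
    HTTP.Free;
  end;
end;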




function ShouldShowNews( const ANewHash : string; AShowAfterDays : integer ) : boolean;
// True if the hash differs from the stored one or AShowAfterDays have elapsed
const
  Section = 'NewsPrompt';
  IDHash  = 'LastHash';
  IDLastDayNum = 'LastDayNum';
var
  sLastHash : string;
  iLastPromptDay : integer;
begin
  // Check hash
  sLastHash := ReadRegKeyUserStr( Section, IDHash, '' );
  Result := not SameText( sLastHash, ANewHash );
  if not Result then
    begin
    // Check elapsed days. Trunc gives the whole-day number;
    // Round would roll over to the next day after midday.
    iLastPromptDay := ReadRegKeyUserInt( Section, IDLastDayNum, 0 );
    Result := Trunc( Now ) - iLastPromptDay > AShowAfterDays;
    end;

  if Result then
    begin
    // Save params for checking next time.
    WriteRegKeyUserStr( Section, IDHash, ANewHash );
    WriteRegKeyUserInt( Section, IDLastDayNum, Trunc( Now ) );
    end;
end;





procedure CheckForWebNewsOnTimer;
var
  sHashHex, S : string;
begin
  try
    S := GetNewsURL; // < my news address
    sHashHex := HTTPToMD5HashHex( S );
    If ShouldShowNews( sHashHex, 120 {days default} ) then
      begin
      WebBrowserDlg( S );
      end;

  except
    // .. ignore or save as info
  end;
end;
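
The `ReadRegKeyUser...` and `WriteRegKeyUser...` helpers called above are not shown in the answer. As a rough sketch only, they might look something like the following, built on `TRegistry`. The signatures are guessed from the call sites, and the unit name and `BaseKey` path are hypothetical placeholders, so treat this as one possible implementation rather than the author's own.

```pascal
unit NewsRegHelpers; // hypothetical unit name

interface

function ReadRegKeyUserStr(const ASection, AIdent, ADefault: string): string;
function ReadRegKeyUserInt(const ASection, AIdent: string; ADefault: Integer): Integer;
procedure WriteRegKeyUserStr(const ASection, AIdent, AValue: string);
procedure WriteRegKeyUserInt(const ASection, AIdent: string; AValue: Integer);

implementation

uses
  System.Win.Registry;

const
  // Hypothetical base key; TRegistry uses HKEY_CURRENT_USER as its default root
  BaseKey = 'Software\MyCompany\MyApp\';

function ReadRegKeyUserStr(const ASection, AIdent, ADefault: string): string;
var
  Reg: TRegistry;
begin
  Result := ADefault;
  Reg := TRegistry.Create;
  try
    // Fall back to the default if the key or the value is missing
    if Reg.OpenKeyReadOnly(BaseKey + ASection) and Reg.ValueExists(AIdent) then
      Result := Reg.ReadString(AIdent);
  finally
    Reg.Free;
  end;
end;

function ReadRegKeyUserInt(const ASection, AIdent: string; ADefault: Integer): Integer;
var
  Reg: TRegistry;
begin
  Result := ADefault;
  Reg := TRegistry.Create;
  try
    if Reg.OpenKeyReadOnly(BaseKey + ASection) and Reg.ValueExists(AIdent) then
      Result := Reg.ReadInteger(AIdent);
  finally
    Reg.Free;
  end;
end;

procedure WriteRegKeyUserStr(const ASection, AIdent, AValue: string);
var
  Reg: TRegistry;
begin
  Reg := TRegistry.Create;
  try
    // Second argument True creates the key if it does not exist yet
    if Reg.OpenKey(BaseKey + ASection, True) then
      Reg.WriteString(AIdent, AValue);
  finally
    Reg.Free;
  end;
end;

procedure WriteRegKeyUserInt(const ASection, AIdent: string; AValue: Integer);
var
  Reg: TRegistry;
begin
  Reg := TRegistry.Create;
  try
    if Reg.OpenKey(BaseKey + ASection, True) then
      Reg.WriteInteger(AIdent, AValue);
  finally
    Reg.Free;
  end;
end;

end.
```

Guarding the reads with `ValueExists` means a first run, when nothing has been stored yet, simply yields the defaults, which is what `ShouldShowNews` relies on.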
Ruddy answered 12/8, 2014 at 8:23 Comment(0)
G
4

You could use Indy's TIdHTTP to send a HEAD request and examine the Last-Modified / Content-Length response headers.

e.g.:

procedure TForm1.Button1Click(Sender: TObject);
var
  Url: string;
  Http: TIdHTTP;
  LastModified: TDateTime;
  ContentLength: Int64; // Indy 10 declares Response.ContentLength as Int64
begin
  Url := 'http://yoursite.com/newspage.html';
  Http := TIdHTTP.Create(nil);
  try
    Http.Head(Url);
    LastModified := Http.Response.LastModified;
    ContentLength := Http.Response.ContentLength;
    ShowMessage(Format('Last-Modified: %s ; Content-Length: %d', [DateTimeToStr(LastModified), ContentLength]));
  finally
    Http.Free;
  end;
end;

When the TWebBrowser.OnDocumentComplete event fires, make a HEAD request and store the LastModified and ContentLength values. Then periodically make further HEAD requests (via a TTimer, for example) and compare the results with the stored values to test for changes.
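
As a minimal sketch of that store-and-compare step: the unit name, the `TPageStamp` record and both function names below are hypothetical, introduced only for illustration. It assumes the server actually returns the two headers on a HEAD request.

```pascal
unit NewsPageStamp; // hypothetical unit name

interface

type
  // Hypothetical record for the two header values remembered between checks
  TPageStamp = record
    LastModified: TDateTime;
    ContentLength: Int64;
  end;

function FetchPageStamp(const AUrl: string): TPageStamp;
function PageHasChanged(const AOld, ANew: TPageStamp): Boolean;

implementation

uses
  System.DateUtils, IdHTTP;

function FetchPageStamp(const AUrl: string): TPageStamp;
var
  Http: TIdHTTP;
begin
  Http := TIdHTTP.Create(nil);
  try
    // HEAD transfers only the headers; Indy raises an exception
    // on network failures or HTTP error status codes.
    Http.Head(AUrl);
    Result.LastModified := Http.Response.LastModified;
    Result.ContentLength := Http.Response.ContentLength;
  finally
    Http.Free;
  end;
end;

function PageHasChanged(const AOld, ANew: TPageStamp): Boolean;
begin
  // Either value differing is treated as "the page was modified"
  Result := (not SameDateTime(AOld.LastModified, ANew.LastModified)) or
            (AOld.ContentLength <> ANew.ContentLength);
end;

end.
```

In use, the form would keep one `TPageStamp` field, fill it with `FetchPageStamp` in the OnDocumentComplete handler, and call `PageHasChanged` with a fresh `FetchPageStamp` result from the timer handler, wrapped in a try/except so that a failed request does not interrupt the application.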

These header values depend on the web server implementation and may not reflect the file system date/time on the server (for dynamic pages, for example). Your server might not return these headers at all.

For example, with static HTML pages on IIS, Last-Modified returns the file system last modified date-time, which is what you want.

For dynamic content (e.g. PHP, ASP, .NET), if you control the web server, you could add your own custom HTTP response header on the server side to indicate the file system date/time (e.g. X-Last-Modified), or set the response's Last-Modified header to suit your needs, and then examine that header on the client side.
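
On the client side, a custom header can be read through Indy's raw header list. The sketch below assumes the server has been configured to send the example `X-Last-Modified` header mentioned above; the unit and function names are hypothetical.

```pascal
unit NewsCustomHeader; // hypothetical unit name

interface

function GetCustomLastModified(const AUrl: string): string;

implementation

uses
  IdHTTP;

function GetCustomLastModified(const AUrl: string): string;
var
  Http: TIdHTTP;
begin
  Http := TIdHTTP.Create(nil);
  try
    Http.Head(AUrl);
    // RawHeaders holds every response header as name/value pairs;
    // the result is an empty string if the server did not send this one.
    Result := Http.Response.RawHeaders.Values['X-Last-Modified'];
  finally
    Http.Free;
  end;
end;

end.
```

Because the value is only compared with the one stored from the previous check, it can be kept as an opaque string without parsing it into a TDateTime.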


If you need to examine or hash the entire HTTP content, you need to issue a GET request instead: Http.Get(Url)

Giliane answered 11/8, 2014 at 11:3 Comment(6)
Thanks. Would LastModified actually access the file system date-time? For example, if someone modified the file with ANY tool, does that return the modified time (like Delphi's FileAge would on a local disk)? - Ruddy
@Brian, it all depends on the server you have. Your server might not return the mentioned header fields, or even respond to the HEAD request. - Sweptwing
@Sweptwing So presumably I could use Kobik's suggestion above to fetch the entire page as a stream and use a hash instead? - Ruddy
@Brian, fetching the page would not fit your "cracking a walnut with a sledgehammer" requirement. Whether you can use what is described in this post really depends on what your server (which we don't know) can do. If it's a server you wrote yourself, then you can surely extend it to return the header fields mentioned here in response to the HEAD request. That is the right way. And I believe that e.g. Apache can be configured that way as well (just a guess). - Sweptwing
@TLama: It's just an FTP site with a bunch of files that we drag and drop up there :-) - Ruddy
@BrianFrost, for static HTML pages on IIS the Last-Modified header returns the file system date-time. - Giliane
