How can I decode/recreate Google Flights Search URLs?
P

3

10

The Problem

On Google Flights, search information is encoded in a URL parameter, presumably so users can share flight searches with each other easily. The URL format looks like this:

https://www.google.com/travel/flights/search?tfs=CBwQAhoeagcIARIDSE5MEgoyMDIxLTA5LTEzcgcIARIDU0ZPGh5qBwgBEgNTRk8SCjIwMjEtMDktMTdyBwgBEgNITkxwAYIBCwj___________8BQAFIAZgBAQ

I am trying to write a program that can generate flight search URLs given flight information (origin, destination, flight dates, passengers, etc.). To do this I need to know how the information is encoded in the URL so I can recreate it.

What I've tried

I know that the flight info is encoded in base64 or some variant of it (I've been using base64decode.org for testing). For a round-trip HNL-SFO flight departing 2021-09-13 and returning 2021-09-17, Google Flights has this URL:

https://www.google.com/travel/flights/search?tfs=CBwQAhoeagcIARIDSE5MEgoyMDIxLTA5LTEzcgcIARIDU0ZPGh5qBwgBEgNTRk8SCjIwMjEtMDktMTdyBwgBEgNITkxwAYIBCwj___________8BQAFIAZgBAQ

The part of the tfs query parameter before the underscores decodes to

jHNL
2021-09-13rSFOjSFO
2021-09-17rHNLp

which contains some (but not all) recognizable flight info. What I don't understand is the whitespace between the recognizable information. Using this site, I learned that the whitespace is a mix of characters:

U+0008 : <control> BACKSPACE [BS]
U+001C : <control> INFORMATION SEPARATOR FOUR {file separator (FS)}
U+0010 : <control> DATA LINK ESCAPE [DLE]
U+0002 : <control> START OF TEXT [STX]
U+001A : <control> SUBSTITUTE [SUB]
U+001E : <control> INFORMATION SEPARATOR TWO {record separator (RS)}
U+006A : LATIN SMALL LETTER J
U+0007 : <control> BELL [BEL]
U+0008 : <control> BACKSPACE [BS]
U+0001 : <control> START OF HEADING [SOH]
U+0012 : <control> DEVICE CONTROL TWO [DC2]
U+0003 : <control> END OF TEXT [ETX]
U+0048 : LATIN CAPITAL LETTER H
U+004E : LATIN CAPITAL LETTER N
U+004C : LATIN CAPITAL LETTER L
...

This suggests that I'm not decoding the data properly. I've tried some other variants of base64, but haven't had any luck.
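One thing worth checking here, offered as a hedged sketch rather than a confirmed answer: `-` and `_` belong to the URL-safe Base64 alphabet (RFC 4648), which many online decoders do not accept, and that may be why decoding stopped at the underscores. Assuming the `tfs` value is URL-safe Base64 with its `=` padding stripped, Python's `base64.urlsafe_b64decode` can decode the whole string. The helper name `decode_tfs` is made up for illustration.

```python
import base64

# tfs value copied from the HNL <-> SFO URL in the question.
TFS = (
    "CBwQAhoeagcIARIDSE5MEgoyMDIxLTA5LTEzcgcIARIDU0ZPGh5qBwgBEgNTRk8S"
    "CjIwMjEtMDktMTdyBwgBEgNITkxwAYIBCwj___________8BQAFIAZgBAQ"
)


def decode_tfs(tfs: str) -> bytes:
    """Decode a tfs value, ASSUMING URL-safe Base64 without '=' padding."""
    padded = tfs + "=" * (-len(tfs) % 4)  # restore the stripped padding
    return base64.urlsafe_b64decode(padded)


raw = decode_tfs(TFS)
print(len(raw), "bytes")
print(raw.hex(" "))  # hex dump shows the non-text bytes between the airport codes
```

The hex dump makes it easier to see that the "whitespace" is binary data sitting between the readable airport codes and dates, rather than a decoding error.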

Does anyone know how this info is encoded? Another thing I haven't been able to figure out is how the information after the underscores (8BQAFIAZgBAQ) is encoded. Based on the behavior of the Google Flights site, I think it encodes passenger information, but it base64 decodes to only whitespace characters.
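The control characters listed above, and the bytes after the underscores, look consistent with the Protocol Buffers wire format (tag bytes, varints, and length prefixes). That is an inference from the byte pattern only; Google does not document it, and the meaning of each field number is unknown. A minimal sketch that walks the top-level fields under that assumption (the names `read_varint` and `walk_fields` are illustrative):

```python
import base64

TFS = (
    "CBwQAhoeagcIARIDSE5MEgoyMDIxLTA5LTEzcgcIARIDU0ZPGh5qBwgBEgNTRk8S"
    "CjIwMjEtMDktMTdyBwgBEgNITkxwAYIBCwj___________8BQAFIAZgBAQ"
)


def read_varint(buf: bytes, pos: int):
    """Read a base-128 varint starting at pos; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear marks the last byte
            return value, pos
        shift += 7


def walk_fields(buf: bytes):
    """Return (field_number, wire_type, value) for each field in buf.

    Only wire types 0 (varint) and 2 (length-delimited) are handled.
    """
    fields = []
    pos = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"unhandled wire type {wire_type}")
        fields.append((number, wire_type, value))
    return fields


raw = base64.urlsafe_b64decode(TFS + "=" * (-len(TFS) % 4))
fields = walk_fields(raw)
for number, wire_type, value in fields:
    print(number, wire_type, value)

# Length-delimited values may be nested messages; try walking the first leg.
first_leg = walk_fields(fields[2][2])
print(first_leg)
```

Under this reading, the bytes after the underscores are a few small integer fields rather than text, which would explain why they print as control characters. Which field carries passengers or cabin class would still have to be found by changing one search option at a time and comparing the output.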

Additional Context

Two years ago I made a working version of the program which produced URLs like

https://www.google.com/flights?hl=en#flt=ORD.MCO.2021-07-16*MCO.ORD.2021-07-19;c:USD;e:1;px:2,2,0,0;sd:1;t:f

Several months ago Google changed the format they use from the above to the encoded version. I want to figure out how to recreate the encoded URLs so I can update my program instead of retiring it.

Punishable answered 27/8, 2021 at 22:28 Comment(0)
S
8

You can have your program output flight URLs in query format using the q URL param. No need to encode/decode the URL.

For example:

https://www.google.com/travel/flights?q=Flights%20to%20SFO%20from%20HNL%20on%202022-09-13%20through%202022-09-17

Which leads to the results page: HNL <> SFO Flight Results
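A minimal Python sketch of generating such a URL, assuming the phrase template from the example above. The function name `round_trip_url` is illustrative; `urllib.parse.quote` is used because it encodes spaces as `%20`, matching the example.

```python
from urllib.parse import quote

BASE = "https://www.google.com/travel/flights"


def round_trip_url(origin: str, destination: str, depart: str, ret: str) -> str:
    """Build a q= search URL from a natural-language phrase."""
    phrase = f"Flights to {destination} from {origin} on {depart} through {ret}"
    return f"{BASE}?q={quote(phrase)}"


url = round_trip_url("HNL", "SFO", "2022-09-13", "2022-09-17")
print(url)
```

Dates are passed as ISO `YYYY-MM-DD` strings, as in the example URL.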

Edit (2023-09-26):

Despite some comments, this method still works. Make sure you're formatting as a natural language search string:

"flights to X from Y on 1234-12-34 through 1234-12-34 2 seats on Airline"

https://www.google.com/travel/flights?q=Flights%20to%20SFO%20from%20HNL%20on%202023-12-13%20through%202023-12-17%202%20seats%20on%20united

HNL <> SFO on United Flight Results
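For the longer phrase with seats and an airline, here is a hedged sketch that appends the optional parts and then decodes the `q` parameter again, which is a quick way to check exactly what phrase is being sent. The function name `flights_query_url` and its parameters are made up for illustration; which phrases Google actually honors is not documented.

```python
from urllib.parse import parse_qs, quote, urlparse

BASE = "https://www.google.com/travel/flights"


def flights_query_url(origin, destination, depart, ret, seats=None, airline=None):
    """Build a q= URL; seats and airline are optional phrase fragments."""
    phrase = f"Flights to {destination} from {origin} on {depart} through {ret}"
    if seats is not None:
        phrase += f" {seats} seats"
    if airline:
        phrase += f" on {airline}"
    return f"{BASE}?q={quote(phrase)}"


url = flights_query_url("HNL", "SFO", "2023-12-13", "2023-12-17",
                        seats=2, airline="united")
print(url)

# Decode the q parameter back to plain text to verify the phrase.
decoded = parse_qs(urlparse(url).query)["q"][0]
print(decoded)
```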

Salaidh answered 18/1, 2022 at 23:2 Comment(17)
Thanks for the tip! Do you know if it's possible to specify passenger count as well? Is there a specification or guide for this anywhere? - Punishable
I've figured out how to do passengers and fare class: E.g. Results - Punishable
That's correct, just keep building out the query string with the parameters you want to prefill. I'm not sure if all of them are controllable, but all the ones I've needed work via this query format. - Salaidh
Thanks, you've been a great help. Do you know whether there's a phrase that can control the number of infants in seats? "Infants" seems to map to infants in lap. That's the only one I've needed that I can't figure out. - Punishable
Since the seated infant is technically purchasing a seat on most airlines, I think it'd be safe to map it to "1 child" instead. I haven't found anything that shows a difference between a "seated infant" vs "child" in this context. - Salaidh
That's what I ended up doing. Works well enough! - Punishable
@Salaidh I'm curious how you figured out this URL!? - Swane
Essentially just took the natural language search and found that part wasn't encoded. Then extrapolated from that all the fields needed for flights, hotels, etc. - Salaidh
@Punishable Can anyone please share more param names, such as the param name that is used to specify passengers, cabin class such as Economy or Business, etc.? - Stumpy
This does not work anymore... - Gratitude
Did anyone find out how to filter specific airlines using the q parameter? - Alti
Also not working for me... - Kilmarx
Does anyone have any idea what works now? - Kilmarx
This method still works, make sure you're formatting as a natural language search string: "flights to X from Y on 1234-12-34 through 1234-12-34 2 seats on Airline" https://www.google.com/travel/flights?q=Flights%20to%20SFO%20from%20HNL%20on%202023-12-13%20through%202023-12-17%202%20seats%20on%20united HNL <> SFO on United Flight Results - Salaidh
Oddly your samples don't work anymore, but I was able to do it manually google.com/travel/… - Abjuration
Samples still work, the dates are just in the past so it seems like it doesn't work. Just update the dates in the URL to be in the future and try again. - Salaidh
Did someone figure out how to get the departure times in the URL? Thanks for all the help so far! Super helpful. - Anomalism
A
0

Google's Bard is able to decode this

My Question: what is this google flights url encoded with? https://www.google.com/travel/flights/search?tfs=CBwQAhopEgoyMDIzLTA5LTA5ag0IAhIJL20vMDJfMjg2cgwIAhIIL20vMDdkZmsaKRIKMjAyMy0wOS0xOGoMCAISCC9tLzA3ZGZrcg0IAhIJL20vMDJfMjg2QAFIA3ABggELCP___________wGYAQE

Bard's response
The Google Flights URL you provided is encoded with the URL-safe Base64 format. This is a way of encoding data into a string that can be transmitted over a network without being corrupted. The encoded string can then be decoded back into its original form.
To decode the URL, you can use a Base64 decoder. There are many online Base64 decoders available, or you can use a programming language that supports Base64 decoding.
Here is the decoded string:
tfs=origin=SFO&destination=LAX&departure_date=2023-06-06&return_date=2023-06-13&adults=1&children=0&infants=0&cabin=economy
This string tells Google Flights to search for flights from SFO to LAX on June 6, 2023, returning on June 13, 2023. The search is for economy class flights for 1 adult.
You can use this string to search for flights on Google Flights. Just copy and paste the string into the search bar on the Google Flights website.

That said, trying to use any Base64 decoding logic (online or in Python) results in gibberish.

The only reproducible, legible URLs I've been able to get are the ones built with the query format listed in the accepted answer.
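The output is less opaque if the decoded bytes are treated as binary and only the printable runs are pulled out, much like the Unix `strings` tool does. The sketch below assumes the `tfs` value is URL-safe Base64 with padding stripped; the four-character minimum run length is an arbitrary choice.

```python
import base64
import re

# tfs value copied from the URL given to Bard above.
TFS = (
    "CBwQAhopEgoyMDIzLTA5LTA5ag0IAhIJL20vMDJfMjg2cgwIAhIIL20vMDdkZmsa"
    "KRIKMjAyMy0wOS0xOGoMCAISCC9tLzA3ZGZrcg0IAhIJL20vMDJfMjg2QAFIA3AB"
    "ggELCP___________wGYAQE"
)

raw = base64.urlsafe_b64decode(TFS + "=" * (-len(TFS) % 4))

# Keep runs of at least four printable ASCII bytes; the rest is binary framing.
runs = re.findall(rb"[ -~]{4,}", raw)
for run in runs:
    print(run.decode("ascii"))
```

The runs include the two travel dates and strings of the form `/m/...`, which are presumably location identifiers. A stray printable byte can stick to the end of a run, because the binary framing bytes sometimes fall in the ASCII range.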

Some examples

One way

https://www.google.com/travel/flights?q=Flights%20to%20TYO%20from%20JFK%20on%202023-09-09%20oneway&curr=USD

1 adult 2 children in first class

https://www.google.com/travel/flights?q=Flights%20to%20SFO%20from%20HNL%20on%202023-09-13%20through%202023-09-17%20with%20one%20adult%20and%20two%20children%20first%20class&curr=USD

1 adult FRA to SFO in Business

https://www.google.com/travel/flights?q=Flights%20to%20SFO%20from%20FRA%20on%202023-09-13%20through%202023-09-17%20with%20one%20adult%20business%20class&curr=USD
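These URLs carry two parameters (`q` and `curr`), so a small sketch with `urllib.parse.urlencode` may be handier than concatenating strings. The function name `search_url` is illustrative. Passing `quote_via=quote` keeps spaces as `%20` as in the examples above, whereas `urlencode` would otherwise emit `+`.

```python
from urllib.parse import quote, urlencode

BASE = "https://www.google.com/travel/flights"


def search_url(phrase: str, currency: str = "USD") -> str:
    """Build a URL with a natural-language q phrase and a curr parameter."""
    query = urlencode({"q": phrase, "curr": currency}, quote_via=quote)
    return f"{BASE}?{query}"


one_way = search_url("Flights to TYO from JFK on 2023-09-09 oneway")
business = search_url(
    "Flights to SFO from FRA on 2023-09-13 through 2023-09-17 "
    "with one adult business class"
)
print(one_way)
print(business)
```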

The only problem I'm having is premium economy. I tried premium%20economy, premium+economy, premium_economy, p_economy, py, and a whole bunch of others, but I can't get premium economy flights to show up.

Amandie answered 5/6, 2023 at 4:17 Comment(8)
Bard's answer is incorrect. The link you demonstrated with is for a trip from New York to Tokyo on Sep 9-18, but Bard says it's for a trip from San Francisco to Los Angeles from Jun 6-13. In fact, Bard gives the same answer for any encoded Google Flights URL that I tried. It seems like Bard is hallucinating here. - Punishable
Did anyone find out how to filter specific airlines using the q parameter? - Alti
I haven't been able to nail down "Premium Economy" either. However, since it is a class of service that is primarily offered only on longer international flights, it didn't fit my use case and I left it alone. - Salaidh
@Alti - you simply add the specific airline to that part of the search query. Note: I wasn't able to get the "Alliance" to work, but didn't spend much time on it either. - Salaidh
@Salaidh can you give me an example of how you add the airline? I tried using their carrier code or their full name, but the filter didn't work for me. - Alti
@Alti I added an example in my accepted answer that only displays results for a specific airline (United), hope that helps! - Salaidh
@Salaidh thanks! Were you able to figure out how to select multiple airlines instead of just one? Tried with comma, and, or. - Alti
Did you figure out how to get the departure times in the URL? - Anomalism
A
-1

I miss having the ability to encode a query and have the same question. Nice work with finding out it's in base64.

I think reverse engineering is the only way to find out how things are encoded. For example, the stuff after the underscores is most likely binary-encoded.

See below for economy:

11101111 10111111 10111101 00010100 00000000 00010100 11101111 10111111 10111101 00011001 11101111 10111111 10111101 00010000 00010000

And the same query, but for business class:

11101111 10111111 10111101 00010100 00000000 00010100 11101111 10111111 10111101 00101001 11101111 10111111 10111101 00010000 00010000

As you can see, the 10th byte goes from 00011001 to 00101001.
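A small sketch for comparing two such dumps byte by byte instead of by eye. It also checks for the sequence `11101111 10111111 10111101` (hex `EF BF BD`), which is the UTF-8 encoding of U+FFFD, the replacement character. Its presence suggests the decoder treated the data as text and replaced bytes it could not interpret, so some of the original bytes may be missing from these dumps.

```python
ECONOMY = (
    "11101111 10111111 10111101 00010100 00000000 00010100 11101111 10111111 "
    "10111101 00011001 11101111 10111111 10111101 00010000 00010000"
)
BUSINESS = (
    "11101111 10111111 10111101 00010100 00000000 00010100 11101111 10111111 "
    "10111101 00101001 11101111 10111111 10111101 00010000 00010000"
)


def to_bytes(bits: str) -> bytes:
    """Convert space-separated 8-bit groups into a bytes object."""
    return bytes(int(octet, 2) for octet in bits.split())


eco, biz = to_bytes(ECONOMY), to_bytes(BUSINESS)

# 1-based positions where the two dumps differ, with both values in binary.
diff = [
    (i + 1, f"{a:08b}", f"{b:08b}")
    for i, (a, b) in enumerate(zip(eco, biz))
    if a != b
]
print(diff)

# Count occurrences of the UTF-8 replacement character U+FFFD.
replacement = "\ufffd".encode("utf-8")
print(eco.count(replacement))
```

Working from the raw bytes of a URL-safe Base64 decode, rather than from text output, should avoid the replacement characters and make comparisons like this more reliable.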

Achaemenid answered 6/10, 2021 at 9:0 Comment(0)
