Getting exact domain name from any URL [duplicate]

I need to extract the exact domain name from any Url.

For example,

Url : http://www.google.com --> Domain : google.com

Url : http://www.google.co.uk/path1/path2 --> Domain : google.co.uk

How can this be done in C#? Is there a complete TLD list or a parser for this task?

Voidable answered 12/5, 2011 at 20:55 Comment(0)

You can use the Uri class to access all components of a URI:

var uri = new Uri("http://www.google.co.uk/path1/path2");

var host = uri.Host;

// host == "www.google.co.uk"

However, there is no built-in way to strip the sub-domain "www" off "www.google.co.uk". You need to implement your own logic, e.g.

var parts = host.ToLowerInvariant().Split('.');

if (parts.Length >= 3 &&
    parts[parts.Length - 1] == "uk" &&
    parts[parts.Length - 2] == "co")
{
    var result = parts[parts.Length - 3] + ".co.uk";

    // result == "google.co.uk"
}
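
The same idea can be generalized beyond ".co.uk" with a lookup of multi-label suffixes. The following is only a minimal sketch: the class name DomainSketch, the method GetRegistrableDomain, and the three-entry suffix set are hypothetical and deliberately incomplete. A complete list of suffixes is maintained as the Public Suffix List, which a real implementation would load instead.

```csharp
using System;
using System.Collections.Generic;

static class DomainSketch
{
    // Hypothetical, deliberately incomplete set of multi-label suffixes.
    // A real implementation would load the full Public Suffix List instead.
    static readonly HashSet<string> MultiPartSuffixes =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "co.uk", "org.uk", "com.au" };

    public static string GetRegistrableDomain(string url)
    {
        var uri = new Uri(url);
        var host = uri.Host;

        // IP addresses and other non-DNS hosts are returned unchanged.
        if (uri.HostNameType != UriHostNameType.Dns)
            return host;

        var parts = host.Split('.');

        // Hosts with one or two labels (e.g. "localhost", "google.com") are returned unchanged.
        if (parts.Length <= 2)
            return host;

        var lastTwo = parts[parts.Length - 2] + "." + parts[parts.Length - 1];

        // Keep three labels when the last two form a known suffix, otherwise keep two.
        var keep = MultiPartSuffixes.Contains(lastTwo) ? 3 : 2;
        return string.Join(".", parts, parts.Length - keep, keep);
    }
}
```

Under these assumptions, DomainSketch.GetRegistrableDomain("http://www.google.co.uk/path1/path2") returns "google.co.uk". Any suffix missing from the set falls back to the last two labels, so the result is only as good as the list behind it.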
Thanatos answered 12/5, 2011 at 20:57 Comment(0)

Use:

new Uri("https://mcmap.net/q/525948/-getting-exact-domain-name-from-any-url-duplicate?s=45faab89-43eb-41dc-aa5b-8a93f2eaeb74#new-answer").GetLeftPart(UriPartial.Authority).Replace("/www.", "/").Replace("http://", "").Replace("https://", "");

Input:

https://mcmap.net/q/525948/-getting-exact-domain-name-from-any-url-duplicate?s=45faab89-43eb-41dc-aa5b-8a93f2eaeb74#new-answer

Output:

mcmap.net

Also works for the following.

http://www.google.com → google.com

http://www.google.co.uk/path1/path2 → google.co.uk

http://localhost.intranet:88/path1/path2 → localhost.intranet:88

http://www2.google.com → www2.google.com

Maleki answered 7/12, 2012 at 10:57 Comment(3)
It will not work: new Uri("http://www.google.com").GetLeftPart(UriPartial.Authority).Replace("http://", "") ≡ "www.google.com" - Lietuva
See the update. Thanks for noticing. - Maleki
For a different scheme, you can use it like this: Uri uri = new Uri(url); string domain = uri.GetLeftPart(UriPartial.Authority).Replace("/www.", "/").Replace(uri.GetLeftPart(UriPartial.Scheme), ""); - Towle
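
For reference, the scheme-independent variant from the last comment could be wrapped in a helper along these lines. This is a sketch only: the names AuthoritySketch and StripSchemeAndWww are made up for illustration, and the method throws a UriFormatException if the input is not an absolute URL.

```csharp
using System;

static class AuthoritySketch
{
    // Returns the authority (host plus any non-default port) without the scheme
    // and without a leading "www." label.
    public static string StripSchemeAndWww(string url)
    {
        var uri = new Uri(url);

        // For "https://www.google.com/x": scheme is "https://",
        // authority is "https://www.google.com".
        var scheme = uri.GetLeftPart(UriPartial.Scheme);
        var authority = uri.GetLeftPart(UriPartial.Authority);

        // "/www." only matches directly after the scheme's "//", so "www2." is left alone.
        return authority.Replace("/www.", "/").Replace(scheme, "");
    }
}
```

Because the scheme is read from the Uri itself, the same call handles http, https, and ftp URLs. A non-default port stays in the result.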

Try the System.Uri class.

http://msdn.microsoft.com/en-us/library/system.uri.aspx

new Uri("http://www.google.co.uk/path1/path2").Host

which returns "www.google.co.uk". From there it's string manipulation. :/
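
One possible shape for that string manipulation, assuming the only goal is to drop a leading "www." label, is sketched here. The names HostSketch and HostWithoutWww are hypothetical, and this does not attempt to find the registrable domain for hosts such as "mail.google.co.uk".

```csharp
using System;

static class HostSketch
{
    // Returns Uri.Host with a leading "www." removed; other sub-domains are kept.
    public static string HostWithoutWww(string url)
    {
        // Uri.Host contains the host name only, without scheme, port, or path.
        var host = new Uri(url).Host;

        const string prefix = "www.";
        return host.StartsWith(prefix, StringComparison.OrdinalIgnoreCase)
            ? host.Substring(prefix.Length)
            : host;
    }
}
```

Unlike approaches built on GetLeftPart(UriPartial.Authority), this drops the port, because Uri.Host never includes it.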

Strobila answered 12/5, 2011 at 20:57 Comment(0)

Use:

var uri = new Uri(Request.RawUrl); // URL of the current request, or substitute your own
var domain = uri.GetLeftPart(UriPartial.Authority);

Input:

Url = http://google.com/?search=true&q=how+to+use+google

Result:

domain = http://google.com
Gallous answered 27/4, 2012 at 17:44 Comment(3)
Does not work; RawUrl does not return the .com address. - Epicardium
@Nick: see the next line too. - Gallous
I used your entire code. This solution may have worked in the past but did not work for .NET 4. - Epicardium

Another variant, without dependencies:

string GetDomainPart(string url)
{
    var doubleSlashesIndex = url.IndexOf("://");
    var start = doubleSlashesIndex != -1 ? doubleSlashesIndex + "://".Length : 0;
    var end = url.IndexOf("/", start);
    if (end == -1)
        end = url.Length;

    string trimmed = url.Substring(start, end - start);
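    // Ordinal comparison avoids culture-sensitive matching of the "www." prefix.
    if (trimmed.StartsWith("www.", StringComparison.Ordinal))
        trimmed = trimmed.Substring("www.".Length);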
    return trimmed;
}

Examples:

http://www.google.com → google.com

http://www.google.co.uk/path1/path2 → google.co.uk

http://localhost.intranet:88/path1/path2 → localhost.intranet:88

http://www2.google.com → www2.google.com

Lietuva answered 30/4, 2014 at 9:4 Comment(0)
