Tag Cloud in C#

10

16

I am making a small C# application and would like to extract a tag cloud from plain text. Is there a function that could do that for me?

Dapper answered 10/12, 2008 at 0:34 Comment(0)
B
14

Building a tag cloud is, as I see it, a two-part process:

First, you need to split and count your tokens. Depending on how the document is structured, as well as the language it is written in, this could be as easy as counting the space-separated words. However, this is a very naive approach, as words like "the", "of", and "a" will have the highest counts and are not very useful as tags. I would suggest implementing some sort of word blacklist (a stop-word list) in order to exclude the most common and meaningless tags.
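A minimal sketch of this first step, assuming English-like text where words are runs of letters. The `TagCounter` class, its `CountTags` method, and the short stop-word list are hypothetical names chosen for illustration; they do not come from any library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class TagCounter
{
    // Hypothetical, deliberately short stop-word list; extend it for real text.
    private static readonly HashSet<string> StopWords = new HashSet<string>(
        new[] { "the", "of", "a", "an", "and", "to", "in", "is", "it" },
        StringComparer.OrdinalIgnoreCase);

    // Returns a (tag -> count) map for every word that is not a stop word.
    public static Dictionary<string, int> CountTags(string text)
    {
        // A word is a run of letters, optionally with inner apostrophes ("don't").
        return Regex.Matches(text ?? string.Empty, @"\p{L}+(?:'\p{L}+)*")
            .Cast<Match>()
            .Select(m => m.Value.ToLowerInvariant())   // count "Cloud" and "cloud" together
            .Where(w => !StopWords.Contains(w))
            .GroupBy(w => w)
            .ToDictionary(g => g.Key, g => g.Count());
    }
}
```

Calling `TagCounter.CountTags(text)` gives the (tag, count) pairs that the second step works from. Lower-casing before grouping is a design choice; drop it if case matters for your tags.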

Once you have the results as (tag, count) pairs, you could use something similar to the following code:

(Searches is a list of SearchRecordEntity; SearchRecordEntity holds the tag and its count; SearchTagElement is a subclass of SearchRecordEntity that adds the TagCategory attribute; and processedTags is a List of SearchTagElement that holds the result.)

double max = Searches.Max(x => (double)x.Count);
List<SearchTagElement> processedTags = new List<SearchTagElement>();

foreach (SearchRecordEntity sd in Searches)
{
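    // Only TagCategory is set below; also copy the tag and count from sd
    // onto the element so the result can be rendered.
    var element = new SearchTagElement();

    // Scale this tag's count against the most frequent tag (0-100).
    double count = (double)sd.Count;
    double percent = (count / max) * 100;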

    if (percent < 20)
    {
        element.TagCategory = "smallestTag";
    }
    else if (percent < 40)
    {
        element.TagCategory = "smallTag";
    }
    else if (percent < 60)
    {
        element.TagCategory = "mediumTag";
    }
    else if (percent < 80)
    {
        element.TagCategory = "largeTag";
    }
    else
    {
        element.TagCategory = "largestTag";
    }

    processedTags.Add(element);
}
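For reference, here is a self-contained variant of the same idea. The class shapes are assumptions: the property names `Tag` and the setters are hypothetical, since the real entities are only described in words above. `TagCategorizer` is also a made-up helper name. It guards against an empty list (where `Max` would throw) and copies the tag and count onto each element.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shapes for the entities described above; the real classes may differ.
public class SearchRecordEntity
{
    public string Tag { get; set; }
    public int Count { get; set; }
}

public class SearchTagElement : SearchRecordEntity
{
    public string TagCategory { get; set; }
}

public static class TagCategorizer
{
    private static readonly string[] Categories =
        { "smallestTag", "smallTag", "mediumTag", "largeTag", "largestTag" };

    public static List<SearchTagElement> Categorize(IEnumerable<SearchRecordEntity> searches)
    {
        var list = searches.ToList();
        var result = new List<SearchTagElement>();
        if (list.Count == 0) return result;   // Max() throws on an empty sequence

        double max = list.Max(x => (double)x.Count);
        foreach (var sd in list)
        {
            double percent = max > 0 ? sd.Count / max * 100 : 0;
            // Five bands of 20 points each: below 20 -> 0, ..., 80 and up -> 4.
            int band = Math.Max(0, Math.Min((int)(percent / 20), Categories.Length - 1));
            result.Add(new SearchTagElement
            {
                Tag = sd.Tag,
                Count = sd.Count,
                TagCategory = Categories[band]
            });
        }
        return result;
    }
}
```

The category names can then be used directly as CSS class names when rendering each tag.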
Breathtaking answered 10/12, 2008 at 0:54 Comment(0)
O
9

I would really recommend using http://thetagcloud.codeplex.com/. It is a very clean implementation that takes care of grouping, counting and rendering of tags. It also provides filtering capabilities.

Outwash answered 10/6, 2009 at 12:10 Comment(2)
Seconded, I've just implemented it and it does everything I need out of the box. - Aerodrome
FYI, the code is still available for download in the Internet Archive: web.archive.org/web/20210701013543/https://archive.codeplex.com/… - Sacttler
S
5

Take a look at http://sourcecodecloud.codeplex.com/

Schlesien answered 8/2, 2012 at 14:3 Comment(0)
M
4

Here is an ASP.NET Cloud Control that might help you at least get started; full source is included.

Memoirs answered 10/12, 2008 at 0:47 Comment(2)
The link you provided is now dead. - Thurber
Still dead. Is it supposed to point to codeproject.com/Articles/14661/Cloud-Control-for-ASP-NET ? - Houseyhousey
O
3

You may want to take a look at WordCloud, a project on CodeProject. It includes 430 stop words (like "the", "an", and "a") and uses the Porter stemming algorithm, which reduces words to their root form so that "stemmed", "stemming", and "stem" are all counted as occurrences of the same word.

It's all in C#. The only thing you would have to do is modify it to output HTML instead of the visualization it creates.

Ordzhonikidze answered 10/12, 2008 at 1:21 Comment(0)
P
1

Have a look at this answer for an algorithm:

Algorithm to implement a word cloud like Wordle

The "DisOrganizer" mentioned in the answers could serve your purpose. With a small change, you can make it serve an image, which is what you wanted. PS: The code is written in C#: https://github.com/chandru9279/zasz.me/blob/master/zasz.me/

Peloria answered 19/6, 2012 at 12:18 Comment(0)
L
1

Take a look at this. It worked for me. There is a project named WebExample under the Examples folder that should help you solve this. https://github.com/chrisdavies/Sparc.TagCloud

Leveille answered 21/8, 2013 at 11:21 Comment(0)
D
0

I'm not sure if this is exactly what you're looking for, but it may help you get started:

A LINQ query that counts word frequency (in VB, but I'm converting it to C# now):

Dim Words = "Hello World ))))) This is a test Hello World"
Dim CountTheWords = From str In Words.Split(" ") _
                    Where Char.IsLetter(str) _
                    Group By str Into Count()
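A possible C# equivalent of the VB query, as a sketch rather than the author's own conversion. It assumes the VB code runs with Option Strict Off, where `Char.IsLetter(str)` ends up testing the first character of each word. `WordFrequency` is a hypothetical class name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WordFrequency
{
    // Counts how often each space-separated word occurs,
    // keeping only words whose first character is a letter.
    public static Dictionary<string, int> Count(string words)
    {
        return words
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Where(w => char.IsLetter(w[0]))
            .GroupBy(w => w)
            .ToDictionary(g => g.Key, g => g.Count());
    }
}
```

For the sample string above, the ")))))" token is filtered out and "Hello" and "World" are each counted twice. Unlike the VB version, this one drops empty entries, so repeated spaces do not cause an error.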
Dignify answered 10/12, 2008 at 0:44 Comment(0)
C
0

You could store a category and the amount of items it has in some sort of collection, or database table.

From that, you can get the count for a certain category and have certain bounds. So your parameter is the category, and your return value is a count.

So if the count is greater than 10 and less than 20, apply a CSS style of a certain size to the link.

You can store these counts as keys in a collection, and then get the value where the key matches your return value (as I mentioned above).
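A rough sketch of that approach, assuming the category counts are already in a dictionary. The bounds, the CSS class names, the `/tags/` URL pattern, and the `TagCloudHtml` class are all hypothetical and only illustrate the idea.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;

public static class TagCloudHtml
{
    // Hypothetical bounds and class names; tune them to your data and stylesheet.
    public static string CssClassFor(int count)
    {
        if (count >= 20) return "tag-large";
        if (count > 10) return "tag-medium";   // greater than 10 and less than 20
        return "tag-small";
    }

    // Renders one link per category, sorted by name, with a size class based on its count.
    public static string Render(IDictionary<string, int> categoryCounts)
    {
        var links = categoryCounts
            .OrderBy(kv => kv.Key, StringComparer.OrdinalIgnoreCase)
            .Select(kv => string.Format(
                "<a class=\"{0}\" href=\"/tags/{1}\">{2}</a>",
                CssClassFor(kv.Value),
                Uri.EscapeDataString(kv.Key),      // safe inside the URL
                WebUtility.HtmlEncode(kv.Key)));   // safe inside the markup
        return string.Join(" ", links);
    }
}
```

The stylesheet would then define a font size for each of the three classes.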

I haven't got source code at hand for this process, but you won't find a simple function to do all this for you either. A control, yes (as above).

This is a very conventional approach and the standard way of doing it from what I've seen in magazine tutorials, etc, and the first approach I would think of (not necessarily the best).

Cranmer answered 10/12, 2008 at 1:17 Comment(0)
B
-1

The Zoomable TagCloud Generator extracts keywords from a given source (a text file or other sources) and displays the tag cloud as a Zooming User Interface (ZUI).

Berg answered 13/6, 2012 at 19:52 Comment(0)
