
Leetchi visualisation challenge

Description of the challenge


Crowdfunding platforms have enjoyed growing success over the last few years. With more than $43 Billion euros raised in 2016, this alternative to traditional financing has attracted significant interest from the research community. While reward-based crowdfunding has been extensively studied, only a few studies so far have focused on donation-based crowdfunding platforms. In this challenge, we propose to analyze the Leetchi crowdfunding platform and the impact of 70 variables on the success of Leetchi money pots.

The variables are extracted from Twitter and the Leetchi website and fall into the following four groups: Leetchi project page, Twitter community, tweet content, and the tweet diffusion process.

The dataset is composed of 1,415 Leetchi projects along with 50,119 project-related tweets.

The goal of the challenge is to provide a set of recommendations for decision support: which strategies to apply and which key factors to take into account when trying to raise funds through donations.

Software requirements

Click on the link above and select Get Started. On the form, enter your university email address for “Business email” and, under “Organization”, enter the name of your school.

A quick start guide for Tableau software is available.


The Leetchi dataset is available for download on the Harvard Dataverse.


List of all variables included in the dataset

  1. Target: indicates whether the project is dedicated to a single person, a group of people, or an organization

  2. #Facebook: the number of times the word “Facebook” and Facebook URLs are detected in the description. Previous work found that the presence of Facebook links is correlated with the success of all-or-nothing projects

  3. #Twitter: the number of times the word “Twitter” and Twitter URLs are detected in the description

  4. #Media: the number of videos and images that are used in the project page

  5. Length: the number of characters used to write a description

  6. SMOG: an estimate of the years of education required to understand the description. The formula relies on the number of sentences and the number of polysyllables observed in these sentences. A Python library was used to compute the SMOG index as well as the other text-complexity metrics.

  7. FleschKincaidGrade: the grade score using the Flesch-Kincaid Grade Formula. This formula is computed from the ratio of total words to total sentences and the ratio of total syllables to total words.

  8. ColemanLiauIndex: the grade score using the Coleman-Liau Formula. This formula relies on the average number of letters per 100 words and the average number of sentences per 100 words.

  9. AutomatedReadabilityIndex: the grade score computed using the Automated Readability Index, which outputs a number that approximates the grade level needed to comprehend the text. It relies on the ratio of total characters to total words and the ratio of total words to total sentences.

  10. LinsearWriteFormula: the grade score using the Linsear Write Formula. This formula is based on the number of easy words and hard words from a sample of the text.

  11. GunningFog: the Gunning fog index of the given text, a weighted average of the number of words per sentence and the number of long words per word.

  12. DifficultWords: the number of difficult words, using the Dale-Chall Word List of familiar words as a reference. This list contains three thousand familiar words that are known in reading by at least 80% of children in grade 5.

  13. DaleChallReadabilityScore: the grade level using the New Dale-Chall Formula. This metric is computed from the ratio of difficult words to total words and the ratio of words to sentences.

  14. #Click: the number of times the word “click” was used (e.g. “Click on donate to contribute to the project”)

  15. #Secure_payment: the number of times the expression “secure payment” was used

  16. #Currency: the number of times currency symbols were used

  17. #Numbers: the number of times numbers were used in the project description

  18. #Link: the number of links that are used in the description

  19. #Email: the number of e-mail addresses that are used in the description

  20. Language: the language used in the description (French, English, German, Spanish, other). The language was detected using the ISO 3166-1 alpha-2 country code available in the project URL. When the code was unavailable, a Python library was used to identify the language.

  21. #Promoters: the number of unique profiles that tweeted/retweeted about the project

  22. #Tweets: the number of tweets mentioning the project

  23. #Replies: the number of replies to tweets mentioning the project

  24. #Retweets: the number of retweets mentioning the project

  25. #Mentions: the number of mentions in project-related tweets

  26. #Duration: the number of days between the first and the last project-related tweets observed in the dataset (duration of the campaign)

  27. #ActiveDays: the number of days on which at least one tweet was recorded during the campaign

  28. #InactiveDays: the number of days of the campaign with no observed activity. This feature is computed as #Duration - #ActiveDays

  29. AVG_favorites: the average number of times the Twitter promoters liked tweets before joining the community

  30. MAX_favorites: the highest number of likes among profiles belonging to the community

  31. AVG_statuses: the average number of tweets published by the promoters before joining the community

  32. MAX_statuses: the number of tweets published by the most active promoter before joining the community

  33. AVG_friends: the average number of friends of unique promoters of the community

  34. MAX_friends: the maximum number of friends that promoters have

  35. #Influencers500: the number of promoters in the community having more than 500 followers on Twitter

  36. #Influencers1k: the number of promoters in the community having more than 1,000 followers on Twitter

  37. #Influencers10k: the number of promoters in the community having more than 10,000 followers on Twitter

  38. #Influencers50k: the number of promoters in the community having more than 50,000 followers on Twitter

  39. #Influencers100k: the number of promoters in the community having more than 100,000 followers on Twitter

  40. MAX_followers: the number of followers of the most influential profile in the community

  41. AVG_followers: the average number of followers of unique promoters of the community

  42. Leetchi: indicates whether the project was promoted by the official Leetchi Twitter account

  43. #Help: the number of times the word “Help” was mentioned in project-related tweets

  44. #HashtagRt: the number of times the hashtag #Rt was used in tweets

  45. #Retweet: the number of times the word “Retweet” was used in tweets

  46. #RT: the number of times the word “RT” was used in tweets

  47. #Mobilise: the number of times the word “mobilise” was used in tweets

  48. #Solidarity: the number of times words containing “solidarit” were used in tweets

  49. #Important: the number of times the word “Important” and words expressing urgency were used in tweets

  50. #Thank: the number of times acknowledgment-related words were used in tweets

  51. #Hashtags: the number of hashtags used during the project-related Twitter campaign

  52. AVG_hashtags: the average number of hashtags per tweet

  53. AVG_mentions: the average number of mentions per tweet

  54. Tweetsentiment: the general sentiment of tweets (computed using the AFINN dictionary of words tagged with sentiment scores [IMM2011-06010])

  55. MAX_Sentiment: the most positive tweet score

  56. MIN_Sentiment: the most negative tweet score

  57. AVG_Tweetsentiment: the average sentiment of tweets
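
The readability variables above (SMOG, FleschKincaidGrade, ColemanLiauIndex, AutomatedReadabilityIndex, GunningFog) are all simple functions of text counts. The sketch below uses the commonly published coefficients for each formula and takes the counts as inputs. It is an illustration only: the function names are hypothetical, and the library used to build the dataset may count sentences, syllables, or long words differently, so the values may not match the dataset exactly.

```python
import math


def flesch_kincaid_grade(words, sentences, syllables):
    # Words per sentence and syllables per word, as described for FleschKincaidGrade.
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def automated_readability_index(characters, words, sentences):
    # Characters per word and words per sentence.
    return 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43


def coleman_liau_index(letters, words, sentences):
    # Average letters and average sentences per 100 words.
    letters_per_100 = letters / words * 100
    sentences_per_100 = sentences / words * 100
    return 0.0588 * letters_per_100 - 0.296 * sentences_per_100 - 15.8


def smog_index(polysyllables, sentences):
    # Polysyllable count normalised to a 30-sentence sample.
    return 1.043 * math.sqrt(polysyllables * 30 / sentences) + 3.1291


def gunning_fog(words, sentences, complex_words):
    # Words per sentence plus the percentage of long (complex) words.
    return 0.4 * ((words / sentences) + 100 * (complex_words / words))


if __name__ == "__main__":
    # Made-up counts for a short description (not taken from the dataset).
    print(flesch_kincaid_grade(words=100, sentences=5, syllables=150))
    print(automated_readability_index(characters=450, words=100, sentences=5))
    print(coleman_liau_index(letters=450, words=100, sentences=5))
    print(smog_index(polysyllables=10, sentences=5))
    print(gunning_fog(words=100, sentences=5, complex_words=10))
```

Passing counts rather than raw text keeps the formulas separate from tokenisation choices, which is where implementations usually differ.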

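The campaign-activity variables (#Duration, #ActiveDays, #InactiveDays) and the influencer counts (#Influencers500 to #Influencers100k) can be derived from tweet dates and follower counts. The sketch below is one possible reading of the definitions above, not the dataset authors' code: the function names are hypothetical, and it assumes that #Duration counts both the first and the last day, so that #InactiveDays = #Duration - #ActiveDays is never negative.

```python
from datetime import date


def campaign_activity(tweet_dates):
    """Derive duration and activity counts from the dates of project-related tweets."""
    active_days = set(tweet_dates)  # distinct days with at least one tweet
    # Assumption: the first and last days are both counted in the duration.
    duration = (max(active_days) - min(active_days)).days + 1
    return {
        "Duration": duration,
        "ActiveDays": len(active_days),
        "InactiveDays": duration - len(active_days),
    }


def count_influencers(follower_counts,
                      thresholds=(500, 1_000, 10_000, 50_000, 100_000)):
    """Count promoters with strictly more followers than each threshold."""
    return {t: sum(1 for n in follower_counts if n > t) for t in thresholds}


if __name__ == "__main__":
    # Made-up example values (not taken from the dataset).
    dates = [date(2016, 3, 1), date(2016, 3, 1), date(2016, 3, 3), date(2016, 3, 6)]
    print(campaign_activity(dates))
    print(count_influencers([120, 800, 1500, 60000]))
```

Because each threshold is applied independently, a promoter with 60,000 followers is counted in #Influencers500, #Influencers1k, #Influencers10k and #Influencers50k at once, so these variables are expected to be strongly correlated.
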