In my previous post (Can Steemit Get Rid of Spam and Abuse?) I discussed the importance of dealing with spam on a fundamental level (e.g. blockchain) instead of the user interface level (web interface) which does nothing but mask the underlying garbage. I've been met with criticism that I would like to address in detail. Before I do that, I'd like to share some statistics on the current state of spam.
Spam Stats
Spam on Steemit comes mainly in two flavors: comments and memos. I'll be tackling the memos because they're easier to study. I compiled some statistics on memo spam from 2017-01 to 2018-05, looking at three values:
- The total count and size
- The count and size of memos >= 200 characters (bytes)
- The count and size of memos containing the word resteem
About resteem services, here are a few recent ones I got in my memos. They are all above 200 characters in length:
Greetings! Want to promote your post? Get more upvotes and followers with our resteem and upvote service! Get your post resteemed to 9000+ followers, a minimum of 40+ upvotes, and a @anonwhale upvote (1200 STEEM POWER)! Send 1.000 SBD or 0.800 STEEM to @anonwhale with your post URL as the memo!
!!! This account is the advertising account for the @Byresteem account. !!! Hello, This month +600 people have used ByResteem.I can promote your post. 27.000 Followers + Upvote @byresteem 7000+ SP Upvote with min +200 Different accounts + New followers + Loyality bonus FREE . Send 2 SBD or 2 STEEM To ByResteem URL as Memo Service ACTIVE.
PROMOTE YOUR POST Your post will be resteemed to 10000+ followers on multiple Steemit accounts Just send ONLY 0.3 SBD or 0.3 STEEM to @cryptomoneymade with post URL in the memo Service Active 24/7
Based on those examples, and on the assumption that a common memo is only a few words and/or a URL (e.g. a memo to a bid bot), which rarely goes above 150 characters (bytes), let's consider anything of 200 characters or more a strong indication of spam. There are exceptions of course, like encrypted memos, but those are negligible in the overall picture (I looked them up).
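The three values can be tallied with a few lines of Python. This is only a minimal sketch of the heuristic described above, not the query I actually ran: the input format (pairs of timestamp and memo), the function name `memo_stats`, the case-insensitive match on "resteem", skipping empty memos, and measuring size as UTF-8 bytes are all assumptions made for illustration.

```python
from collections import defaultdict

SPAM_LENGTH_THRESHOLD = 200  # cutoff used in this post


def memo_stats(transfers):
    """Tally [count, size] per month for the three categories.

    `transfers` is an iterable of (timestamp, memo) pairs, where timestamp
    is a string starting with "YYYY-MM" (assumed format).
    """
    stats = defaultdict(lambda: {
        "total": [0, 0],    # every non-empty memo
        "long": [0, 0],     # memos at or above the threshold
        "resteem": [0, 0],  # memos mentioning "resteem"
    })
    for timestamp, memo in transfers:
        if not memo:
            continue  # assumption: transfers without a memo are ignored
        month = timestamp[:7]
        size = len(memo.encode("utf-8"))  # assumption: size in UTF-8 bytes
        buckets = ["total"]
        if size >= SPAM_LENGTH_THRESHOLD:
            buckets.append("long")
        if "resteem" in memo.lower():
            buckets.append("resteem")
        for name in buckets:
            stats[month][name][0] += 1
            stats[month][name][1] += size
    return dict(stats)
```

A memo can land in both the long and the resteem buckets, which is why those two columns overlap in the tables.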
The Results
Monthly Memo Count
Month | Total | >=200 | Resteem |
---|---|---|---|
2017-01 | 26188 | 5532 | 45 |
2017-02 | 21407 | 4254 | 13 |
2017-03 | 32473 | 4066 | 24 |
2017-04 | 27890 | 3314 | 33 |
2017-05 | 60713 | 11822 | 90 |
2017-06 | 150796 | 15279 | 278 |
2017-07 | 278183 | 44010 | 2313 |
2017-08 | 595543 | 127601 | 54784 |
2017-09 | 462380 | 117123 | 20681 |
2017-10 | 447681 | 107181 | 8005 |
2017-11 | 433096 | 115419 | 19039 |
2017-12 | 664779 | 139484 | 9805 |
2018-01 | 1361190 | 345875 | 27629 |
2018-02 | 1940340 | 709957 | 40264 |
2018-03 | 2148478 | 821438 | 143316 |
2018-04 | 2121434 | 737484 | 307399 |
2018-05 | 2307300 | 916292 | 431905 |
Total | 13079871 | 4226131 | 1065623 |
Monthly Memo Size (bytes)
Month | Total | >=200 | Resteem |
---|---|---|---|
2017-01 | 2537918 | 1411550 | 4684 |
2017-02 | 1885462 | 1082526 | 1756 |
2017-03 | 2394982 | 1110444 | 2212 |
2017-04 | 2917862 | 1650032 | 3768 |
2017-05 | 5353836 | 3324722 | 13182 |
2017-06 | 14772056 | 5674366 | 44966 |
2017-07 | 34775292 | 14703050 | 582056 |
2017-08 | 129742178 | 81317174 | 55174380 |
2017-09 | 81081340 | 45195078 | 9063728 |
2017-10 | 76464296 | 41416694 | 3997320 |
2017-11 | 75714554 | 42275132 | 8328084 |
2017-12 | 96628432 | 45691720 | 3699114 |
2018-01 | 217610828 | 112216334 | 6663676 |
2018-02 | 374562438 | 236764808 | 9090078 |
2018-03 | 454936186 | 315553208 | 73810742 |
2018-04 | 475030746 | 334207646 | 172938556 |
2018-05 | 584496862 | 443932922 | 248580856 |
Total | 2630905268 | 1727527406 | 591999158 |
As we can see, there has been a significant increase in memos and spam, with an explosion that started in 2018-01. Coincidentally, that was the period when SBD was trading at $10-$15. I guess high SBD prices were a magnet for spammers. Even though the SBD price has since gone down, the memo spam kept growing. Let's keep in mind that the user base has been steadily growing too, so that's another factor in spam growth.
What's interesting is how much of that traffic is spam. By count, the shares aren't too high: 32% and 8% of all memos for >=200 and resteem memos respectively. Size, however, tells a different and more alarming story. The total memo size is 2.45GB, of which >=200 memos take up 1.61GB (66%) and resteem memos 0.55GB (23%). That's what concerns me most. Currently the blockchain file is around 115GB, so 1.61GB is 1.4% of it (resteem memos are mostly included in the >=200 category). This may not seem like much to some, but imagine the numbers in a few months.
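For anyone who wants to check the percentages, here is the arithmetic as a short Python sketch. The totals are copied from the two tables above; the helper names `share` and `summarize` are just illustrative. Note that the GB figures in this post correspond to dividing by 1024³.

```python
GIB = 1024 ** 3  # the post's "GB" figures match this divisor

# Totals from the "Monthly Memo Count" and "Monthly Memo Size" tables.
counts = {"total": 13_079_871, ">=200": 4_226_131, "resteem": 1_065_623}
sizes = {"total": 2_630_905_268, ">=200": 1_727_527_406, "resteem": 591_999_158}
BLOCKCHAIN_GB = 115  # approximate blockchain file size quoted in the post


def share(part, whole):
    """Percentage of `whole` taken by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


def summarize():
    rows = {}
    for key in (">=200", "resteem"):
        size_gb = sizes[key] / GIB
        rows[key] = {
            "count_share_pct": share(counts[key], counts["total"]),
            "size_share_pct": share(sizes[key], sizes["total"]),
            "size_gb": round(size_gb, 2),
            "blockchain_share_pct": share(size_gb, BLOCKCHAIN_GB),
        }
    return rows


if __name__ == "__main__":
    for category, row in summarize().items():
        print(category, row)
```

Rounded to whole numbers, the output matches the 32% and 8% count shares and the 66% and 23% size shares quoted above.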
Criticism
The idea I proposed in Can Steemit Get Rid of Spam and Abuse? drew some interesting criticism in the comments and in other channels. I appreciate the feedback, and here is my response to each point.
Implementing a personalized blacklist for each user on the blockchain side would add more complexity to the code.
The blockchain code is already complex. For example, each user is allowed to vote for 30 witnesses; did that add more complexity? The same principle applies here: a limited personal blacklist (e.g. 30-50 entries), and I don't think that would add any more complexity. Besides, Steemit Inc. is working on SMTs, communities, oracles, AppBase, etc., which are far more complex than what we currently have. But for argument's sake, even if a personal block list would add complexity, isn't it worth doing to improve the platform for a cleaner future?
It's better to keep spam filtering on the UI side
Here are the current anti-spam measures that we have:
- @steemcleaners, @cheetah, @spaminator and others (operated by real users who spend their time and energy hunting for spammers, flagging them and maintaining blacklists).
- Muting in the web UI (hides the unwanted account's comments from the user, but they remain visible to the rest of the world).
I don't think that's sufficient, because those measures do not address the issue on a more fundamental level. We need better measures, just like email spam filters, that prevent the spam from ever reaching the system and being stored on it. Why? Because spam equals bloat: it eats resources and has no benefits. Would you let garbage accumulate in your house? At some point, the whole house will stink.
Having a (personal) block list is censorship which goes against the principle of a blockchain
Again, I invoke the example of the stinky house. It's not censorship if I want to keep my own house clean. If some people consider putting out the garbage censorship, they have the wrong definition of what censorship is. Is it censorship to block bad actors? Remember that JSON spammer a while ago who was spamming the crap out of the blockchain and who caused serious problems to the network? After all, the spammer was expressing him(her)self for BiGGaDiCK appreciation, right? Did we let it go unchecked in the name of non-censorship? Of course not. In fact, Steemit Inc. promptly developed a patch to minimize such attacks in the future. That's right, spam is a form of attack and spammers do not deserve any privileges, because if we grant them any, they will take advantage of them. So censorship is not the issue here.
We could additionally block spammers in the UI by preventing them from posting via steemit.com
Interesting thought. Combined with blacklists, it would work for manual spammers, but spammers mostly run bots, and bots do not use steemit.com to broadcast their crap. They are automated processes that interact directly with the blockchain, not web interfaces.
If we block spammers in a personal blacklist, they can always make more accounts to spam with
To me that's not a valid argument. The "more accounts" argument can be applied to pretty much any circumstance on Steem, and it's been an open door to all sorts of abuses. Even some whales abuse the platform (without spamming) via their bots and multiple accounts. We can only hope that Steemit's on-boarding filters keep improving to prevent spammers/abusers from creating additional free accounts (because that's what most of them do). Other than that, anyone can create accounts once they have a foothold in the platform.
Rising Problem and How to Deal With It
The problem with a social media platform powered by a blockchain is the inability to modify transactions. As a result, we can't delete all the spam that has accumulated so far. In my view we need to truly block spam and prevent it from happening and accumulating in the first place. One way to do that is on a fundamental blockchain level, not the UI. I have proposed that each user be allowed to have their own blacklist of a reasonable size, around 50 entries. So when a spammer tries to send a memo (or even a comment) to the user, the blockchain would reject their transaction. Simple and effective. Spam blocking and filtering already works in the real world (e.g. emails), so why not implement it for the Steem blockchain? I'm not saying that this solution will get rid of 100% of spam (that's impossible), but at least it would minimize the impact of spam without having to waste voting power on flagging and hiding the content, for example. If anyone has other ideas, I'd be glad to hear them.
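To make the proposal concrete, here is a rough sketch of the rule in Python. It is purely illustrative and has nothing to do with the actual Steem codebase: `Account`, `BlacklistFull`, `validate_transfer` and the cap constant are hypothetical names, and rejecting every transfer from a blocked sender (rather than only those carrying a memo) is a simplifying assumption.

```python
MAX_BLACKLIST_SIZE = 50  # cap suggested in this post


class BlacklistFull(Exception):
    """Raised when an account tries to exceed its blacklist cap."""


class Account:
    """Hypothetical account object holding a capped personal blacklist."""

    def __init__(self, name):
        self.name = name
        self.blacklist = set()

    def block(self, other):
        # Re-blocking an existing entry is a no-op; new entries respect the cap.
        if other not in self.blacklist and len(self.blacklist) >= MAX_BLACKLIST_SIZE:
            raise BlacklistFull(
                f"{self.name} already blocks {MAX_BLACKLIST_SIZE} accounts"
            )
        self.blacklist.add(other)

    def unblock(self, other):
        self.blacklist.discard(other)


def validate_transfer(sender_name, recipient):
    """Return True if the transaction may be included, False if rejected.

    Hypothetical validation rule: a transfer is rejected when the sender
    is on the recipient's personal blacklist, so its memo is never stored.
    """
    return sender_name not in recipient.blacklist
```

With this rule, once `alice.block("spammer")` has been called, `validate_transfer("spammer", alice)` returns `False` and the memo never reaches the chain. A small fixed cap keeps the lookup cheap, much like the 30-witness vote limit.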
Available & Reliable. I am your Witness. I want to represent You.
🗳 If you like what I do, consider voting for me 🗳
If you never voted before, I wrote a detailed guide about Voting for Witnesses.
Go to https://steemit.com/~witnesses. My name is listed in the Top 50. Click once.
Alternatively you can vote via SteemConnect
https://v2.steemconnect.com/sign/account-witness-vote?witness=drakos&approve=1