In a previous post, I wanted to thank my voters by mentioning each of them, since their votes were funding a charity donation I was making on their behalf. To do that, I needed to grab the list of voters and format it so it didn't produce a note a mile long. For this task, I decided to use Notepad++ for Windows. If you don't have it, I highly recommend you download it here:
https://notepad-plus-plus.org/downloads/
STEP 1: Grab Your List
STEP 2: Paste The Unformatted List Into Notepad++
Open up Notepad++, create a new note, and paste in your unformatted list using CTRL-V.
STEP 3: Perform REGEX Magic
Refer to the GIF below to see the following steps in action. Note: In the Replace dialog (Search - Replace), you must switch the search mode from "Normal" to "Regular expression" to perform this step.
- Delete vote amounts by finding :.* (colon-dot-star), and "Replace All" with nothing
- Insert an @ by finding ^ (caret), and "Replace All" with @ (at sign)
- Add a comma by finding $ (dollar sign), and "Replace All" with , (comma)
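For anyone curious how the same three replacements look outside Notepad++, here is a minimal Python sketch. The sample voter list, its "name: amount" layout, and the function name are my own assumptions for illustration; your raw list may look different.

```python
import re

def apply_three_replacements(raw: str) -> str:
    """Mimic the three "Replace All" passes on a multi-line list."""
    text = raw.strip()  # drop a trailing newline so ^ and $ don't match an empty last line
    # Pass 1: delete the colon and everything after it on each line.
    text = re.sub(r":.*", "", text)
    # Pass 2: insert an @ at the start of every line.
    text = re.sub(r"^", "@", text, flags=re.MULTILINE)
    # Pass 3: append a comma at the end of every line.
    text = re.sub(r"$", ",", text, flags=re.MULTILINE)
    return text

# Hypothetical sample data, not a real voter list.
sample = "alice: $0.05\nbob: $0.12\ncarol: $0.01"
print(apply_three_replacements(sample))
```

With this sample, each line becomes @alice, / @bob, / @carol, and is still on its own line, which is the same state the list is in before the blank-space cleanup.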
STEP 4: Kill The Blank Space
- Delete white space by going to "Edit" - "Blank Operations" - "Remove Unnecessary Blank And EOL"
From here you should have a single-line list of all your voters, which you can copy out of Notepad++ (CTRL-C) and paste (CTRL-V) into your post or thank-you comment. This method can also be used for downvoters by selecting the downvote number instead.
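If you would rather script the whole thing, the steps can be collapsed into one pass. This is only a sketch under the same assumption as before, namely that each line looks like "name: amount"; the function name and sample data are hypothetical. Unlike the Notepad++ result, it joins the names with ", " and so leaves no trailing comma.

```python
import re

def voters_to_mentions(raw: str) -> str:
    """Turn a vertical 'name: amount' list into one line of @mentions."""
    # Capture the text before the first colon on each line, ignoring
    # surrounding whitespace and any blank lines in between.
    names = re.findall(r"^\s*([^:\s]+)\s*:", raw, flags=re.MULTILINE)
    return ", ".join(f"@{name}" for name in names)

# Hypothetical sample data with a blank line and a trailing newline.
sample = "alice: $0.05\n\nbob: $0.12\ncarol: $0.01\n"
print(voters_to_mentions(sample))
```

For the sample above this prints @alice, @bob, @carol on a single line, ready to paste.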
Notepad++ can be a powerful tool for manipulating raw data without having to write a script in Python or another scripting language. Regular expressions offer a lot of flexibility in this regard. The examples I used were very simple, but they can be made much more complex based upon your needs. For more information on using REGEX with Notepad++, go to:
https://community.notepad-plus-plus.org/topic/15765/faq-desk-where-to-find-regex-documentation
Thanks for reading. If you have a faster way of accomplishing the same task within Hive, please share it in the comments. Aside from Hive, this technique can be used on any vertical list of raw data.