Convert CamelCase to Underscores using SED and tr: GreenPlum MPP Refactor


All of our SQL Server databases use CamelCase as the naming convention. Unfortunately, GreenPlum MPP does not preserve CamelCase unless you wrap every identifier in double quotes; unquoted names are folded to lowercase. Since I was transitioning a table from SQL Server to GreenPlum MPP, and because I’m becoming a bash wizard (ha!), I wondered if there was a simple way to convert CamelCase column names to underscores. It took a bit of trial and error and some expert-level googling, but the following did the trick. (Feel free to leave a comment if there is a more efficient way to do this.)

From the directory that contains your file, run the following command. It finds each uppercase letter that starts a new word inside a name in phil_camel.txt and inserts an underscore before it. The output is written to phil_underscore.txt (the original file is not modified).

sed -r 's/([a-z])([A-Z])/\1_\l\2/g' phil_camel.txt > phil_underscore.txt

If you examine the file you’ll see that the underscores have been added, but some uppercase characters remain, such as the leading capital of each name. Generally, when you use underscores to separate words in a database, you use only lowercase letters. The following command replaces all uppercase letters in phil_underscore.txt with lowercase letters and writes the result to a new file called phil_under_lower.txt.

tr '[:upper:]' '[:lower:]' < phil_underscore.txt > phil_under_lower.txt
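
If you don’t need the intermediate file, the two steps can probably be chained into a single pipeline. The following is only a sketch: the function name camel_to_snake is made up for illustration, and the pattern ([a-z0-9])([A-Z]) is my own assumption about what counts as a word boundary (a lowercase letter or digit followed by an uppercase letter). It leaves all lowercasing to tr, so it does not depend on the \l escape.

```shell
# Hypothetical helper (not part of the original commands):
# reads CamelCase names on stdin, writes lower_snake_case names on stdout.
camel_to_snake() {
  # Step 1: put an underscore between a lowercase letter (or digit)
  #         and the uppercase letter that follows it.
  # Step 2: lowercase everything.
  sed -r 's/([a-z0-9])([A-Z])/\1_\2/g' | tr '[:upper:]' '[:lower:]'
}

# Quick check on two sample column names.
printf 'CustomerId\nFirstNameLast\n' | camel_to_snake
# prints:
# customer_id
# first_name_last
```

To run it against the files used in this post, you would call camel_to_snake < phil_camel.txt > phil_under_lower.txt.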

That’s it. You’re now a bash genius like me.

Why is this picture here? Because it showed up when I googled "bash".