# troubleshooting
a
Hi guys. The documentation mentions that we can use built-in functions as the transformFunction. Is there any way to use a UDF as a transform function? If not, can we use a multi-line Groovy script instead? I tried to write the script on one line with ";" between the statements and it gives me an error:
"ingestionConfig": {
        "transformConfigs": [{
          "columnName": "fcm_token",
          "transformFunction": "Groovy({import javax.crypto.Cipher;import javax.crypto.spec.SecretKeySpec;Cipher cipher = Cipher.getInstance('AES/ECB/PKCS5Padding');cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec('1234567812345678'.getBytes('UTF-8'), 'AES'));cipher.doFinal(fcm_token_raw.getBytes('UTF-8')).encodeBase64())).encodeBase64()},fcm_token_raw)"
        }]
The goal of this Groovy script is to encrypt the column. I didn't find any built-in function in Pinot that can encrypt data (sha, sha256 and md5 only create an irreversible hash).
m
You can use a UDF as a transform function. However, afaik, you can't currently call a transform function from within Groovy. It should be easy to add the UDF to the Pinot code base (you can file a PR).
n
btw multi-line Groovy is possible. For example, here's one I'd written recently:
"transformConfigs": [
        {
          "columnName": "labels_json_str",
          "transformFunction": "Groovy({def labelsMap = [:]; for (int i = 0; i < labels.size(); i++) { labelsMap[labels[i].element.key] = labels[i].element.value; }; return groovy.json.JsonOutput.toJson(labelsMap)}, labels)"
        }
      ]
There might be some other syntax error in the script.
If you decide to go with the Groovy approach, try the script first in an online compiler.
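One way to follow that advice is to run the same logic as plain Java before putting it into the table config. The sketch below is only an illustration: the class name `EncryptCheck` and the sample input are hypothetical, and the key is the placeholder from the script in this thread. It uses the same `AES/ECB/PKCS5Padding` cipher and Base64 encoding, and adds a decrypt step so the round trip can be checked.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper class for checking the encryption logic outside Pinot.
public class EncryptCheck {
    // Placeholder key taken from the script in this thread (16 bytes, so AES-128).
    private static final byte[] KEY = "1234567812345678".getBytes(StandardCharsets.UTF_8);

    // Encrypts the input with AES/ECB/PKCS5Padding and returns it Base64-encoded.
    public static String encrypt(String plain) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(KEY, "AES"));
        byte[] encrypted = cipher.doFinal(plain.getBytes(StandardCharsets.UTF_8));
        return Base64.getEncoder().encodeToString(encrypted);
    }

    // Reverses encrypt(): Base64-decodes, then decrypts with the same key.
    public static String decrypt(String encoded) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
        cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(KEY, "AES"));
        byte[] decrypted = cipher.doFinal(Base64.getDecoder().decode(encoded));
        return new String(decrypted, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        String token = "sample-fcm-token"; // hypothetical input value
        String encrypted = encrypt(token);
        System.out.println(encrypted);
        // Prints true when decrypting gives back the original input.
        System.out.println(decrypt(encrypted).equals(token));
    }
}
```

If the round trip works here, the same statements can be joined with ";" inside `Groovy({...}, fcm_token_raw)`, using single-quoted strings so they don't clash with the JSON quotes.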
a
As expected, when I tried to use my UDF as a transform function I got an error. When I tested it in a query it worked fine.
"ingestionConfig": {
        "transformConfigs": [{
          "columnName": "fcm_token",
          "transformFunction": "encrypt(fcm_token_raw)"
        }]
      },
ERROR: 2022/06/27 16:48:16.179 INFO [AddTableCommand] [main] {"_code":400,"_error":"Invalid transform function 'encrypt(fcm_token_raw)' for column 'fcm_token'"}
But when I tried @Neha Pawar's approach it finally worked 🥳🥳 Thank you both for your help 😍
n
Great! Can you tell us more about how you wrote the custom function and how you added it to Pinot? We'd like to understand why that didn't work too.
a
Here is the encrypt UDF that works only in query mode, not as a transform function. I added it to the plugins as a JAR file, as described in the docs: https://gist.github.com/AhmedElsagher/fd941e7d6d9607167a52825c8e370d03 I tried both encrypt(fcm_token_raw) and EncryptUDF.encrypt(fcm_token_raw) in transformConfigs, but it only worked as a Groovy script:
"transformFunction": "Groovy({import javax.crypto.Cipher;import java.security.Key;import javax.crypto.spec.SecretKeySpec;Key aesKey = new SecretKeySpec('1234567812345678'.getBytes('UTF-8'), 'AES');Cipher cipher = Cipher.getInstance('AES/ECB/PKCS5Padding');cipher.init(Cipher.ENCRYPT_MODE, aesKey);return cipher.doFinal(fcm_token_raw.getBytes('UTF-8')).encodeBase64()},fcm_token_raw)"
n
@ahmed would you mind creating an issue on GitHub for this? (the custom UDF not working in the ingestion part)
e
We ran into the same issue. May I know if an issue was created? We can call the scalar UDFs in a query but not at ingestion. cc: @Lee Wei Hern Jason
Solved.