I am trying to create a webhook for an inbound API from a ven cfml #cfml-general

Jason Roozee

08/14/2024, 6:13 PM

I am trying to create a webhook for an inbound API from a vendor. The vendor is posting a JSON string in the content body (getHttpRequestData().content), and one of the values in the JSON contains a UCS-2 encoded string (essentially UTF-16LE). Normally, if I am getting a UTF-16 string in the form scope, I would simply call setEncoding("form", "utf-8") and that's it. But I am having a heck of a time trying to convert the value coming from the deserializeJSON of getHttpRequestData().content. I've tried UTF-16, UTF-16BE, and UTF-16LE. The JSON string is:

{"msg_id": "f25ac62a-e0a2-4445-9359-06858fd1833c", "message": "桥�?�?"}

I've tried the following:

<cfset content = getHttpRequestData().content>
<cfset data = deserializeJSON(content)>

<!--- Attempt 1: treat the value as UTF-16 and re-encode as UTF-8 --->
<cfset msg = CharsetEncode(CharsetDecode(data.message, "UTF-16"), "UTF-8")>
<cffile action="append" file="#ExpandPath(".")#\callback.log" output="#now()# UTF-16:#msg#"/>

<!--- Attempt 2: same conversion with big-endian byte order --->
<cfset msg = CharsetEncode(CharsetDecode(data.message, "UTF-16BE"), "UTF-8")>
<cffile action="append" file="#ExpandPath(".")#\callback.log" output="#now()# UTF-16BE:#msg#"/>

<!--- Attempt 3: same conversion with little-endian byte order --->
<cfset msg = CharsetEncode(CharsetDecode(data.message, "UTF-16LE"), "UTF-8")>
<cffile action="append" file="#ExpandPath(".")#\callback.log" output="#now()# UTF-16LE:#msg#"/>

Any ideas? The content (as UTF-8) in "message" is the word "hey".
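
When the rendered characters are misleading, it can help to compare bytes instead. The following is a minimal sketch, not the vendor's format: it takes the known plain text "hey" from the post and prints its hex bytes under UTF-8 and both UTF-16 byte orders. The variable names are made up for the example.

```cfml
<!--- Known plain text, per the original post --->
<cfset expected = "hey">

<!--- charsetDecode() converts a string to binary using the named charset;
      binaryEncode(..., "hex") renders that binary as a hex string --->
<cfset hexUtf8    = binaryEncode(charsetDecode(expected, "UTF-8"), "hex")>
<cfset hexUtf16be = binaryEncode(charsetDecode(expected, "UTF-16BE"), "hex")>
<cfset hexUtf16le = binaryEncode(charsetDecode(expected, "UTF-16LE"), "hex")>

<cfoutput>
UTF-8:    #hexUtf8#<br>
UTF-16BE: #hexUtf16be#<br>
UTF-16LE: #hexUtf16le#<br>
</cfoutput>
```

For "hey" these should come out as 686579, 006800650079, and 680065007900. Running the same binaryEncode(charsetDecode(...), "hex") over the received data.message and comparing against those three strings is one way to see which encoding, if any, the value actually matches.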

cfvonner

08/14/2024, 6:42 PM

Not sure if this would work or not, but can you decode the whole getHttpRequestData().content before deserializing the JSON?

Jason Roozee

08/14/2024, 6:43 PM

I did try that, but it just messed up the rest of the JSON that was already in UTF-8. Only a part of the JSON is in UTF-16LE.

Jason Roozee

08/14/2024, 6:57 PM

Just in case it was a JSON parsing issue, I tried extracting the content from the JSON variable directly, same thing:

Jason Roozee

08/14/2024, 8:04 PM

I found a solution:
1. Write the getHttpRequestData().content to a file with no charset set.
2. Load the file as binary.
3. Convert the binary to UTF-8.
4. Deserialize the data.
5. Use CharsetDecode with UTF-16BE, then re-encode to UTF-8.
This seems to be the only way I can get it to work.

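<cfset content = getHttpRequestData().content>
<!--- Write the body to disk without specifying a charset, then read it back as binary --->
<cffile action="write" file="#ExpandPath(".")#\rawdata.bin" output="#content#"/>
<cfset binaryDataRead = fileReadBinary("#ExpandPath(".")#\rawdata.bin")>
<!--- Interpret the bytes as UTF-8 and parse the JSON --->
<cfset content = CharsetEncode(binaryDataRead, "UTF-8")>
<cfset data = deserializeJSON(content)>
<!--- "message" is the key shown in the sample payload --->
<cfset msgutf16 = data.message>
<cfset msg = CharsetEncode(CharsetDecode(msgutf16, "UTF-16BE"), "UTF-8")>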

🤢 1

denny

08/14/2024, 8:43 PM

I was going to mention that it would be for all the content, not just a portion of it, which it looks like you have sorted out

Jason Roozee

08/14/2024, 8:52 PM

Well, I jumped the gun. It worked for some decoding, but it's not working for all tests. Argh.
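
One possible reading of the sample, offered as a guess rather than a diagnosis: the first character of the posted message is U+6865, and its two UTF-16BE bytes are the same two bytes as the ASCII letters "he". The sketch below checks only that one character. The variable names are made up for the example.

```cfml
<!--- U+6865 is the first character of the sample "message" value --->
<cfset firstChar = chr(inputBaseN("6865", 16))>

<!--- Turn the character into its UTF-16BE bytes --->
<cfset bytes = charsetDecode(firstChar, "UTF-16BE")>

<!--- Show the bytes as hex, then read the same bytes as UTF-8 text --->
<cfoutput>#binaryEncode(bytes, "hex")# / #charsetEncode(bytes, "UTF-8")#</cfoutput>
```

This should print 6865 / he. If the value really is single-byte text that was packed two bytes per character somewhere upstream, a string with an odd number of bytes (such as the three bytes of "hey") would leave one byte that cannot form a 16-bit unit, which may be why the UTF-16BE round trip works for some inputs and not others.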

Jason Roozee

08/14/2024, 8:53 PM

I was thinking of trying to read getPageContext().getRequest().getInputStream() next - which I've never done before.

denny

08/14/2024, 8:54 PM

I think it's just this:

<cfset content = getHttpRequestData().content>
<cfset data = deserializeJSON(content)>

where you are doing implicit conversion to whatever the system is (via deserializeJSON). Instead of that, you want to always specify a charset when you're switching from binary (getHttpRequestData().content) to text. So theoretically do what you were doing, but put this before the deserializeJSON call:

<cfset json = CharsetEncode(CharsetDecode(getHttpRequestData().content, "UTF-16"), "UTF-8")>
<cfset object = deserializeJSON(json)>

denny

08/14/2024, 8:56 PM

Also, do you have your server set to UTF-8 as the default? Previously on Windows it would use Windows CP-1252, but I think everything is using UTF-16 now.

Jason Roozee

08/14/2024, 8:58 PM

I tried that. The issue is the inbound request is coming in as content-type "application/json" (not UTF-8), and somewhere (IIS or CF) is then treating the character set accordingly, which is making it a pain to convert.

denny

08/14/2024, 9:00 PM

If you do isBinary(getHttpRequestData().content), what does it say?

Jason Roozee

08/14/2024, 9:00 PM

Something in the middle is manipulating the data so it won't properly decode. I feel confident trying to get the RAW data from getPageContext().getRequest().getInputStream() would work

Jason Roozee

08/14/2024, 9:01 PM

That returns false.

Jason Roozee

08/14/2024, 9:02 PM

Because of the content-type directive, application/json normally isn't going to be binary, but I need to access it as binary to be able to convert it. getPageContext().getRequest().getInputStream() seems to be the only way.

denny

08/14/2024, 9:04 PM

That is a pretty common thing to have to do (using the inputStream) but I thought there was a way without dropping into java

denny

08/14/2024, 9:05 PM

Pretty sure you'd want to do any converting prior to the deserializeJSON call regardless

Jason Roozee

08/14/2024, 9:07 PM

Agreed. I've never used getInputStream, know if there are any examples of using it out there? ChatGPT is struggling at it. lol
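
A minimal sketch of reading the raw request bytes from the servlet input stream, under several assumptions: the engine is Adobe ColdFusion (where getPageContext().getRequest() returns the servlet request), the JVM is Java 9 or later (for readAllBytes()), and nothing earlier in the request has already consumed the body. Treating the outer JSON as UTF-8 follows what was said earlier in the thread; it is not confirmed by the vendor.

```cfml
<!--- Assumption: Adobe ColdFusion; other engines may expose the servlet request differently --->
<cfset inputStream = getPageContext().getRequest().getInputStream()>

<!--- readAllBytes() is a java.io.InputStream method available from Java 9 onward.
      The result is a Java byte array, which CFML treats as binary. --->
<cfset rawBytes = inputStream.readAllBytes()>

<!--- A request body can be read only once. If something upstream already read it,
      rawBytes will be empty and this approach will not help. --->
<cfif len(rawBytes) gt 0>
    <!--- Name the charset explicitly when going from binary to text --->
    <cfset json = charsetEncode(rawBytes, "UTF-8")>
    <cfset data = deserializeJSON(json)>
<cfelse>
    <cfset data = {}>
</cfif>
```

Logging binaryEncode(rawBytes, "hex") before any decoding is one way to confirm what the vendor actually sent, independent of how IIS or CF later interprets it.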

denny

08/14/2024, 9:09 PM

I'm certain there are with how often I was messing with the pageContext in the past lol, probably Ben has a nice blog post on it. 😄

Jason Roozee

08/14/2024, 9:09 PM

Thanks - I'll search around. I think I've done it once before, I'll have to dig around

denny

08/14/2024, 9:21 PM

This is such a common problem for CF I'm surprised nobody has chimed in with the definitive solution

💯 1

denny

08/14/2024, 9:25 PM

The only thing to be careful about around the IO streams is closing and flushing. Otherwise they should work fine. But I'm almost certain there is a clean way of solving this with the encoding and charset functions available…

✅ 1

bkbk

08/16/2024, 12:22 PM

Looking at the code that works, I wondered what would happen if you did away with the file-write and file-read operations:

<cfset content = getHttpRequestData().content>
<cfset content = charsetEncode(content, "UTF-8")>
<cfset data = deserializeJSON(content)>
<cfset msgutf16 = data.message>
<cfset msg = charsetEncode(charsetDecode(msgutf16, "UTF-16BE"), "UTF-8")>

denny

08/16/2024, 4:20 PM

Hmm, might need to do a content.getBytes() (since it's text vs. binary) for that charsetEncode, but that should be the same, neh?
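
On the getBytes() point: charsetEncode() expects binary as its first argument, so a string has to be turned into bytes first. The sketch below uses charsetDecode() with a named charset for that step, since Java's String.getBytes() with no argument falls back to the JVM default charset. The sample string is made up for illustration.

```cfml
<!--- A stand-in for an already-decoded request body --->
<cfset text = '{"msg_id": "1", "message": "hey"}'>

<!--- String to bytes, then bytes back to string, both with the same charset --->
<cfset bytes = charsetDecode(text, "UTF-8")>
<cfset roundTrip = charsetEncode(bytes, "UTF-8")>

<!--- compare() returns 0 when the two strings are identical --->
<cfoutput>#compare(text, roundTrip)#</cfoutput>
```

Encoding and decoding with the same charset returns the original text, so that step by itself cannot repair characters that were misread earlier in the request. Any real fix has to happen where the bytes are first turned into text.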
