POST with no body calls RPC function with json argument; crashes · Issue #2399 · PostgREST/postgrest

link管理
链接快照平台
输入网页链接，自动生成快照
标签化管理网页链接
Description of issue

We have an overloaded RPC method, something like this:
CREATE OR REPLACE FUNCTION api_x.hello(_payload json) RETURNS jsonb AS $$
CREATE OR REPLACE FUNCTION api_x.hello() RETURNS jsonb AS $$
Note there's one version which takes no argument.
Previously, Postgrest would call the second one if a request didn't have a body. After upgrading, it now instead tries to use the first one which predictably crashes because the empty string is not valid JSON ("invalid input syntax for type json").
I'm not sure what changed but I imagine it's to do with #1735 somehow. We do send thePrefer "params=single-object" header, and if we don't the no-arg RPC function becomes callable again. The argument version on the other hand stops being callable.
Discussion
I guess it's debatable whether this is a bug or not. The Prefer "params=single-object" header does indicate we want to send a JSON object, so sending the header with an empty body is poorly defined.
I can't see any harm in restoring the old behaviour should you choose to do so. Passing an empty string as JSON will never work, so when the Content-Type is json (or not set), and the Content-Length is 0, it makes sense to always select the parameter-less version of an RPC function.
For us, the header is added in the nginx proxy automatically (to support some legacy clients). So the fix was simply to not send that header when the content-length is 0. Still, I'm posting this issue in case it might help others and to highlight it is an undocumented backwards incompatible change.
I guess it's debatable whether this is a bug or not.
According to the docs, Prefer: params=single-object works with a function with a single json parameter, not with functions without params, so I'm tempted to think the old behavior was not intended, in my opinion. It could be argued that using this header, even if the body is empty, I'm guaranteed to call the function with the single json parameter and not fallback to other and get unexpected results. Maybe it needs more discussion, but I'm inclined to believe that the current behavior should be kept.
          I don't disagree, but do note even with the current behaviour you won't be "guaranteed" to call that function since the query will crash before the function is invoked! The current behaviour might be semantically correct but practically useless at the same time.
I think if we want to keep the "always use single parameter JSON function when I give this header, even when I don't send a payload" behaviour then it would make sense to either:
fail cleanly with a proper error code instead of crashing (400 Bad Request -- "body required", or maybe 411 or 412)
or for Postgrest to automatically substitute an empty JSON document {}
In our case at least the second option would work nicely.
And again it's not the end of the world if the current behaviour is left as is. I want to emphasise I posted this issue more to highlight the change in behaviour and help others who stumble onto it. It might be considered unexpected and an undocumented change in behaviour (even if the previous behaviour was indeed unintentional).
I don't disagree, but do note even with the current behaviour you won't be "guaranteed" to call that function since the query will crash before the function is invoked!
Ah, well, PostgREST does call the function under the hood, with an empty string parameter: select * from api_x.hello(''); and will throw the error you mention: Invalid input syntax for type json. But I agree that the error message does not seem helpful and looks like it's a random bug.
Now, doing the same test with GET does not throw the error and treats empty params (in the query string) as an empty object, we could use a similar logic here and say that an empty body (empty raw string) means we should use an empty json object as you suggested.
All in all, I like both of your suggestions. I prefer the second one, but maybe the error message is less opinionated (could be argued that an empty array instead of empty object is also valid). I'll tag it as error-message for now.
Just to leave a note of what I found while reviewing this. When doing this request:
POST /rpc/hello
Content-Type: application/json
Prefer: params=single-object
  {"a": 1},
  {"b": 1}
It returns the "All object keys must match" error, which means it's parsing the body here:
      postgrest/src/PostgREST/Request/ApiRequest.hs
        Lines 385 to 387
      35a114a
          Regarding Prefer: params=single-object. Originally, arguments for a function were meant to be specified like:
/rpc/func?id=arg.3&name=arg.nom




    

Following that idea and with underscore operators, we could avoid the special Prefer and do:
POST /rpc/func?myjson=_arg.*
* meaning the whole body.
Right now what we have for POST is basically a:
POST /rpc/func?id=_arg.id&name=_arg.name
{"id": 1, "name": "nom"}
I would consider this a bug.
The Prefer: params=single-object indicates that the request body should be handled in a special way. But that implies that there actually is a request body at all. Prefer headers are optional for the server to apply - we should not try to do so in a situation where it doesn't make sense.
I misunderstood single-object. It also works with arguments in the query string - so having "no request body" is actually fine.
I'd still consider this a bug, but would expect the function to be called with an empty object instead of without argument when this header is passed.
Edit: This is exactly what happens on a GET request - providing no arguments will call the function with an empty json object.
          Hm. There seem to be a few more inconsistencies with POST and Prefer: params=single-object. Given:
CREATE OR REPLACE FUNCTION test.hello(_payload json) RETURNS jsonb LANGUAGE SQL AS $$ select $1::JSON $$;
CREATE OR REPLACE FUNCTION test.hello() RETURNS jsonb LANGUAGE SQL AS $$ select '"no payload"'::JSON $$;
The following happens:
expected results
POST /rpc/hello
Prefer: params=single-object
Content-Type: application/json
{"a":"b"}
{"a": "b"}
POST /rpc/hello
Prefer: params=single-object
Content-Type: application/json
POST /rpc/hello
Prefer: params=single-object
Content-Type: application/json
[{"a":"b"}]
[{"a": "b"}]
POST /rpc/hello
Prefer: params=single-object
Content-Type: application/json
unexpected results
POST /rpc/hello
Prefer: params=single-object
Content-Type: application/json
"string"
POST /rpc/hello
Prefer: params=single-object
Content-Type: application/json
POST /rpc/hello
Prefer: params=single-object
Content-Type: application/json
POST /rpc/hello
Prefer: params=single-object
Content-Type: application/json
POST /rpc/hello
Prefer: params=single-object
Content-Type: application/json
["string"]
{"code":"PGRST102","details":null,"hint":null,"message":"All object keys must match"}
POST /rpc/hello
Prefer: params=single-object
Content-Type: application/json
{"code":"22P02","details":"The input string ended unexpectedly.","hint":null,"message":"invalid input syntax for type json"}
Basically, when using Prefer: params=single-object with POST and a json body... why are we parsing the body at all? We could just pass it to the function unmodified - similar to what the unnamed json param would do.
params=single-object only really seems useful to me when using it with GET, but not with POST (at least not with a json body).
Interestingly, the situation is a bit different when turning the function into one with an unnamed json parameter:
CREATE OR REPLACE FUNCTION test.hello(json) RETURNS jsonb LANGUAGE SQL AS $$ select $1::JSON $$;
CREATE OR REPLACE FUNCTION test.hello() RETURNS jsonb LANGUAGE SQL AS $$ select '"no payload"'::JSON $$;
This basically allows to call the functions in the same way as with the named parameter, when using params=single-object - but additionally allows calling with POST and without the header. Now the behavior is like this:
expected results
POST /rpc/hello
Content-Type: application/json
{"a":"b"}
{"a": "b"}
POST /rpc/hello
Content-Type: application/json
[{"a":"b"}]
[{"a": "b"}]
unexpected results
POST /rpc/hello
Content-Type: application/json
"no payload"
POST /rpc/hello
Content-Type: application/json
"no payload"
POST /rpc/hello
Content-Type: application/json
"string"
"no payload"
POST /rpc/hello
Content-Type: application/json
"no payload"
POST /rpc/hello
Content-Type: application/json
"no payload"
POST /rpc/hello
Content-Type: application/json
"no payload"
POST /rpc/hello
Content-Type: application/json
["string"]
{"code":"PGRST102","details":null,"hint":null,"message":"All object keys must match"}
POST /rpc/hello
Content-Type: application/json
"no payload"
While this seems to be exactly what @aljungberg tried to achieve - I don't understand why we try to parse the json body at all, when passing it on to a "single unnamed json parameter function". I would expect everything to just go through and the no argument function to be never called except in the last case.
Basically, when using Prefer: params=single-object with POST and a json body... why are we parsing the body at all?

I don't understand why we try to parse the json body at all, when passing it on to a "single unnamed json parameter function". I would expect everything to just go through and the no argument function to be never called except in the last case.
@wolfgangwalther Yeah, the json shouldn't be parsed in that case. Likely it was done that way to reuse some code.