Hello, I found this PostgreSQL function that uses ...
# sql
o
Hello, I found this PostgreSQL function that uses regex to extract hashtags from a string. It works well, but I can't be sure if it's safe to use with the
execute
and
captures
? Would I need to escape something?
Copy code
SQL
CREATE FUNCTION parse_tokens(content text, prefix text)  
  RETURNS text[] AS $$
    DECLARE
      regex text;
      matches text;
      subquery text;
      captures text;
      tokens text[];
    BEGIN
      regex := prefix || '(\S+)';
      matches := 'regexp_matches($1, $2, $3) as captures';
      subquery := '(SELECT ' || matches || ' ORDER BY captures) as matches';
      captures := 'array_agg(matches.captures[1])';

      EXECUTE 'SELECT ' || captures || ' FROM ' || subquery
      INTO tokens
      USING LOWER(content), regex, 'g';

      IF tokens IS NULL THEN
        tokens = '{}';
      END IF;

      RETURN tokens;
    END;
  $$ LANGUAGE plpgsql STABLE;
k
It looks fine as long as you control the
prefix
parameter
If you only use it with '#', I'd hardcode that - to make it simpler and safer
oh, hmm, even if you don't control
prefix
it should be fine (maybe except for "regexp DoS")
o
thank you very much!