PostgreSQL Source Code Walkthrough (176) - Query #94 (Syntax Analysis: gram.y) #3

Published: 2020-08-10 19:09:06 Source: ITPUB Blog Author: husthxd Category: Relational Databases

This installment continues the walkthrough of gram.y, PostgreSQL's grammar definition file, and covers its third part: Productions.
A Bison input file is laid out as follows:


%{
Declarations
%}
Definitions
%%
Productions
%%
User subroutines

I. Productions

Productions are the grammar rules written by the user. In conventional notation they look like this:


S -> X \n
X -> X + X | X - X | T_NUMBER

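Each line such as S -> X \n is called a production. The symbol on the left-hand side of the first production is the start symbol, here S.
Bison augments the grammar with one extra production S' -> S placed before all others and treats S' as the real start symbol. This way the start symbol never appears on the right-hand side of a rule, and the parser accepts exactly when S' -> S is reduced.
In Bison, ":" plays the role of "->", alternative productions for the same nonterminal are separated by "|", and the group of rules ends with ";". The braces after a production contain C code that runs when that production is applied (reduced); this code is called the action. When the right-hand side is ε (the empty string), it is written as the comment /* empty */.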
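As a rough illustration of what an action does when a production is applied, the sketch below hand-translates two hypothetical actions, { $$ = $1 + $3; } for X -> X + X and { $$ = $1 - $3; } for X -> X - X. In Bison-generated C, $n in a rule with k right-hand-side symbols is read from the semantic value stack as yyvsp[n - k], and $$ is the variable yyval. The function names and the use of plain int values are assumptions made for this example; they do not come from gram.y.

```c
#include <assert.h>

/* Semantic values are plain ints in this sketch (PostgreSQL uses a union). */
typedef int YYSTYPE_SKETCH;

/*
 * Hand-written equivalent of the action { $$ = $1 + $3; } for X -> X '+' X.
 * yyvsp points at the top of the value stack, i.e. at the value of the last
 * right-hand-side symbol, so for a 3-symbol rule:
 *   $1 == yyvsp[-2], $2 == yyvsp[-1], $3 == yyvsp[0].
 */
YYSTYPE_SKETCH reduce_add(const YYSTYPE_SKETCH *yyvsp)
{
    YYSTYPE_SKETCH yyval = yyvsp[-2] + yyvsp[0];    /* $$ = $1 + $3 */
    return yyval;
}

/* Same idea for { $$ = $1 - $3; } attached to X -> X '-' X. */
YYSTYPE_SKETCH reduce_sub(const YYSTYPE_SKETCH *yyvsp)
{
    return yyvsp[-2] - yyvsp[0];                    /* $$ = $1 - $3 */
}

/* Demo: a value stack holding the values of X, '+', X (7, unused, 5). */
int demo_reduce(void)
{
    YYSTYPE_SKETCH stack[3] = { 7, 0, 5 };
    int sum = reduce_add(&stack[2]);    /* pass a pointer to the stack top */

    assert(sum == 12);
    return sum;
}
```

The value slot of the operator token is present on the stack but unused, which is why $2 never appears in these actions.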
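Nonterminals need no prior declaration: Bison treats every symbol that appears on the left-hand side of some production as a nonterminal. Among terminals, single-character tokens (whose token type equals the character's ASCII code) also need no declaration and are written directly in single quotes inside a production. All other tokens must be declared in the Definitions section, for example %token ABORT_P ABSOLUTE_P ACCESS ACTION ADD_P ADMIN AFTER. Bison assigns each such token a number and writes it to gram.h. Opening that file shows the following code: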


[root@localhost src]# vim ./include/parser/gram.h
...
/* Token type.  */
#ifndef YYTOKENTYPE
# define YYTOKENTYPE
  enum yytokentype
  {
    IDENT = 258,
    FCONST = 259,
    SCONST = 260,
    BCONST = 261,
    XCONST = 262,
    Op = 263,
    ICONST = 264,
    PARAM = 265,
 ....

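Numbering starts at 258 and follows the order in which the tokens are declared in gram.y. Codes up to 255 are left for single-character tokens, and Bison reserves 256 and 257 for its own error and undefined-token symbols.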
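The numbering scheme can be mimicked in a few lines of C. In this sketch the enum values are copied from the gram.h excerpt (with an _SK suffix to mark them as stand-ins), while token_class and token_kind are hypothetical helpers that exist only for illustration.

```c
#include <assert.h>

/* Values copied from the gram.h excerpt; the _SK suffix marks the sketch. */
enum token_sketch {
    IDENT_SK = 258,
    FCONST_SK = 259,
    SCONST_SK = 260,
    BCONST_SK = 261,
    XCONST_SK = 262,
    OP_SK = 263,
    ICONST_SK = 264,
    PARAM_SK = 265
};

/* Hypothetical classification of a token code returned by the lexer. */
enum token_class { TOKEN_EOF, TOKEN_CHAR, TOKEN_RESERVED, TOKEN_NAMED };

enum token_class token_kind(int code)
{
    assert(code >= 0);
    if (code == 0)
        return TOKEN_EOF;       /* the lexer returns 0 at end of input */
    if (code <= 255)
        return TOKEN_CHAR;      /* the code is the character itself, e.g. ';' */
    if (code < 258)
        return TOKEN_RESERVED;  /* 256 and 257 are used by Bison itself */
    return TOKEN_NAMED;         /* declared with %token in gram.y */
}
```

For example, token_kind(';') yields TOKEN_CHAR and token_kind(IDENT_SK) yields TOKEN_NAMED, so a production can mix ';' and IDENT without the two ever colliding.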


...
%token <str>    IDENT FCONST SCONST BCONST XCONST Op
%token <ival>    ICONST PARAM
%token            TYPECAST DOT_DOT COLON_EQUALS EQUALS_GREATER
%token            LESS_EQUALS GREATER_EQUALS NOT_EQUALS
%token <keyword> ABORT_P ABSOLUTE_P ACCESS ACTION ADD_P ADMIN AFTER
    AGGREGATE ALL ALSO ALTER ALWAYS ANALYSE ANALYZE AND ANY ARRAY AS ASC
    ASSERTION ASSIGNMENT ASYMMETRIC AT ATTACH ATTRIBUTE AUTHORIZATION
...

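These tokens can be used directly in scan.l, because scan.l includes parser/gramparse.h, which in turn includes parser/gram.h: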


#include "parser/gramparse.h" --> #include "parser/gram.h"

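From the productions and the precedence declarations, Bison builds the LALR(1) action tables and writes them to gram.c. In gram.c, PG's grammar file yields the function int base_yyparse (core_yyscan_t yyscanner); (the generated name yyparse is mapped to base_yyparse by the base_yy name prefix set in gram.y). This function parses the token stream produced by lexical analysis using the LR(1) procedure. Whenever it needs the next symbol it executes s = yylex() once, and whenever it performs a reduce action it first runs the C code attached to the production being reduced and only then pops the corresponding states off the stack.
Below is part of yyparse from gram.c: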


/*----------.
| yyparse.  |
`----------*/
int
yyparse (core_yyscan_t yyscanner)
{
/* The lookahead symbol.  */
int yychar;
/* The semantic value of the lookahead symbol.  */
/* Default value used for initialization, for pacifying older GCCs
   or non-GCC compilers.  */
YY_INITIAL_VALUE (static YYSTYPE yyval_default;)
YYSTYPE yylval YY_INITIAL_VALUE (= yyval_default);
/* Location data for the lookahead symbol.  */
static YYLTYPE yyloc_default
# if defined YYLTYPE_IS_TRIVIAL && YYLTYPE_IS_TRIVIAL
  = { 1, 1, 1, 1 }
# endif
;
YYLTYPE yylloc = yyloc_default;
    /* Number of syntax errors so far.  */
    int yynerrs;
    int yystate;
    /* Number of tokens to shift before error messages enabled.  */
    int yyerrstatus;
    /* The stacks and their tools:
       'yyss': related to states.
       'yyvs': related to semantic values.
       'yyls': related to locations.
       Refer to the stacks through separate pointers, to allow yyoverflow
       to reallocate them elsewhere.  */
    /* The state stack.  */
    yytype_int16 yyssa[YYINITDEPTH];
    yytype_int16 *yyss;
    yytype_int16 *yyssp;
    /* The semantic value stack.  */
    YYSTYPE yyvsa[YYINITDEPTH];
    YYSTYPE *yyvs;
    YYSTYPE *yyvsp;
    /* The location stack.  */
    YYLTYPE yylsa[YYINITDEPTH];
    YYLTYPE *yyls;
    YYLTYPE *yylsp;
    /* The locations where the error started and ended.  */
    YYLTYPE yyerror_range[3];
    YYSIZE_T yystacksize;
  int yyn;
  int yyresult;
  /* Lookahead token as an internal (translated) token number.  */
  int yytoken = 0;
  /* The variables used to return semantic value and location from the
     action routines.  */
  YYSTYPE yyval;
  YYLTYPE yyloc;
#if YYERROR_VERBOSE
  /* Buffer for error messages, and its allocated size.  */
  char yymsgbuf[128];
  char *yymsg = yymsgbuf;
  YYSIZE_T yymsg_alloc = sizeof yymsgbuf;
#endif
#define YYPOPSTACK(N)   (yyvsp -= (N), yyssp -= (N), yylsp -= (N))
  /* The number of symbols on the RHS of the reduced rule.
     Keep to zero when no symbol should be popped.  */
  int yylen = 0;
  yyssp = yyss = yyssa;
  yyvsp = yyvs = yyvsa;
  yylsp = yyls = yylsa;
  yystacksize = YYINITDEPTH;
...
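The stacks declared in this excerpt (yyss for states, yyvs for semantic values) and the YYPOPSTACK macro are easier to follow on a toy example. The sketch below is a hand-coded shift-reduce parser for the article's grammar S -> X \n, X -> X + X | X - X | T_NUMBER, with + and - treated as left-associative. It is not table-driven like Bison's output, it keeps grammar symbols where a real LR parser keeps automaton states, and every name in it (SketchStacks, sk_push, sk_lex, sk_parse) is hypothetical. End of string is accepted in place of the newline for convenience.

```c
#include <assert.h>
#include <ctype.h>

#define SK_DEPTH 16                     /* fixed depth; Bison can grow its stacks */
enum { SK_NUM = 258, SK_X = 1000 };     /* SK_X marks the nonterminal X */

typedef struct {
    int syms[SK_DEPTH];     /* stands in for the state stack yyss */
    int vals[SK_DEPTH];     /* stands in for the value stack yyvs */
    int top;                /* number of entries on both stacks */
} SketchStacks;

static void sk_push(SketchStacks *s, int sym, int val)
{
    assert(s->top < SK_DEPTH);
    s->syms[s->top] = sym;
    s->vals[s->top] = val;
    s->top++;
}

/* Minimal lexer: returns a token code and stores a number's value in *lval. */
static int sk_lex(const char **p, int *lval)
{
    while (**p == ' ' || **p == '\t')
        (*p)++;
    if (isdigit((unsigned char) **p)) {
        *lval = 0;
        while (isdigit((unsigned char) **p))
            *lval = *lval * 10 + (*(*p)++ - '0');
        return SK_NUM;
    }
    if (**p == '\0')
        return 0;                       /* end of input */
    return (unsigned char) *(*p)++;     /* single-character token */
}

/* Returns 0 on success and 1 on a syntax error, like yyparse. */
int sk_parse(const char *input, int *result)
{
    SketchStacks s = { {0}, {0}, 0 };
    const char *p = input;
    int lval = 0;

    for (;;) {
        int tok = sk_lex(&p, &lval);            /* read the lookahead */

        if (tok == SK_NUM) {
            if (s.top > 0 && s.syms[s.top - 1] == SK_X)
                return 1;                       /* two operands in a row */
            sk_push(&s, SK_X, lval);            /* shift, reduce X -> T_NUMBER */
            if (s.top >= 3) {                   /* reduce X -> X op X */
                int rhs = s.vals[s.top - 1];    /* $3 */
                int op = s.syms[s.top - 2];     /* the operator token */
                int lhs = s.vals[s.top - 3];    /* $1 */
                int yyval = (op == '+') ? lhs + rhs : lhs - rhs;

                s.top -= 3;                     /* like YYPOPSTACK(3) */
                sk_push(&s, SK_X, yyval);       /* push the left-hand side */
            }
        } else if (tok == '+' || tok == '-') {
            if (s.top == 0 || s.syms[s.top - 1] != SK_X)
                return 1;                       /* operator without left operand */
            sk_push(&s, tok, 0);                /* shift the operator */
        } else if (tok == '\n' || tok == 0) {
            if (s.top != 1 || s.syms[0] != SK_X)
                return 1;
            *result = s.vals[0];                /* accept: S -> X \n */
            return 0;
        } else {
            return 1;                           /* unknown token */
        }
    }
}
```

Calling sk_parse("10-3-2", &r) reduces 10-3 before the second operator is shifted, which is what makes the operators left-associative. The action result is computed from the top stack entries before they are popped, matching the order described for yyparse.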

II. Source Code

Below is part of the production definitions in gram.y:


/*
 *    The target production for the whole parse.
 */
stmtblock:    stmtmulti
            {
                pg_yyget_extra(yyscanner)->parsetree = $1;
            }
        ;
/*
 * At top level, we wrap each stmt with a RawStmt node carrying start location
 * and length of the stmt's text.  Notice that the start loc/len are driven
 * entirely from semicolon locations (@2).  It would seem natural to use
 * @1 or @3 to get the true start location of a stmt, but that doesn't work
 * for statements that can start with empty nonterminals (opt_with_clause is
 * the main offender here); as noted in the comments for YYLLOC_DEFAULT,
 * we'd get -1 for the location in such cases.
 * We also take care to discard empty statements entirely.
 */
stmtmulti:    stmtmulti ';' stmt
                {
                    if ($1 != NIL)
                    {
                        /* update length of previous stmt */
                        updateRawStmtEnd(llast_node(RawStmt, $1), @2);
                    }
                    if ($3 != NULL)
                        $$ = lappend($1, makeRawStmt($3, @2 + 1));
                    else
                        $$ = $1;
                }
            | stmt
                {
                    if ($1 != NULL)
                        $$ = list_make1(makeRawStmt($1, 0));
                    else
                        $$ = NIL;
                }
        ;
stmt :
            AlterEventTrigStmt
            | AlterCollationStmt
            | AlterDatabaseStmt
            | AlterDatabaseSetStmt
            | AlterDefaultPrivilegesStmt
            | AlterDomainStmt
            | AlterEnumStmt
            | AlterExtensionStmt
            | AlterExtensionContentsStmt
            | AlterFdwStmt
            | AlterForeignServerStmt
            | AlterForeignTableStmt
            | AlterFunctionStmt
            | AlterGroupStmt
            | AlterObjectDependsStmt
            | AlterObjectSchemaStmt
            | AlterOwnerStmt
            | AlterOperatorStmt
            | AlterPolicyStmt
            | AlterSeqStmt
            | AlterSystemStmt
            | AlterTableStmt
            | AlterTblSpcStmt
            | AlterCompositeTypeStmt
            | AlterPublicationStmt
            | AlterRoleSetStmt
            | AlterRoleStmt
            | AlterSubscriptionStmt
            | AlterTSConfigurationStmt
            | AlterTSDictionaryStmt
            | AlterUserMappingStmt
            | AnalyzeStmt
            | CallStmt
            | CheckPointStmt
            | ClosePortalStmt
            | ClusterStmt
            | CommentStmt
            | ConstraintsSetStmt
            | CopyStmt
            | CreateAmStmt
            | CreateAsStmt
            | CreateAssertStmt
            | CreateCastStmt
            | CreateConversionStmt
            | CreateDomainStmt
            | CreateExtensionStmt
            | CreateFdwStmt
            | CreateForeignServerStmt
            | CreateForeignTableStmt
            | CreateFunctionStmt
            | CreateGroupStmt
            | CreateMatViewStmt
            | CreateOpClassStmt
            | CreateOpFamilyStmt
            | CreatePublicationStmt
            | AlterOpFamilyStmt
            | CreatePolicyStmt
            | CreatePLangStmt
            | CreateSchemaStmt
            | CreateSeqStmt
            | CreateStmt
            | CreateSubscriptionStmt
            | CreateStatsStmt
            | CreateTableSpaceStmt
            | CreateTransformStmt
            | CreateTrigStmt
            | CreateEventTrigStmt
            | CreateRoleStmt
            | CreateUserStmt
            | CreateUserMappingStmt
            | CreatedbStmt
            | DeallocateStmt
            | DeclareCursorStmt
            | DefineStmt
            | DeleteStmt
            | DiscardStmt
            | DoStmt
            | DropAssertStmt
            | DropCastStmt
            | DropOpClassStmt
            | DropOpFamilyStmt
            | DropOwnedStmt
            | DropPLangStmt
            | DropStmt
            | DropSubscriptionStmt
            | DropTableSpaceStmt
            | DropTransformStmt
            | DropRoleStmt
            | DropUserMappingStmt
            | DropdbStmt
            | ExecuteStmt
            | ExplainStmt
            | FetchStmt
            | GrantStmt
            | GrantRoleStmt
            | ImportForeignSchemaStmt
            | IndexStmt
            | InsertStmt
            | ListenStmt
            | RefreshMatViewStmt
            | LoadStmt
            | LockStmt
            | NotifyStmt
            | PrepareStmt
            | ReassignOwnedStmt
            | ReindexStmt
            | RemoveAggrStmt
            | RemoveFuncStmt
            | RemoveOperStmt
            | RenameStmt
            | RevokeStmt
            | RevokeRoleStmt
            | RuleStmt
            | SecLabelStmt
            | SelectStmt
            | TransactionStmt
            | TruncateStmt
            | UnlistenStmt
            | UpdateStmt
            | VacuumStmt
            | VariableResetStmt
            | VariableSetStmt
            | VariableShowStmt
            | ViewStmt
            | /*EMPTY*/
                { $$ = NULL; }
        ;
/*****************************************************************************
 *
 * CALL statement
 *
 *****************************************************************************/
CallStmt:    CALL func_application
                {
                    CallStmt *n = makeNode(CallStmt);
                    n->funccall = castNode(FuncCall, $2);
                    $$ = (Node *)n;
                }
        ;
...

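A brief walkthrough:
1. stmtblock
stmtblock: stmtmulti
stmtblock is the start symbol. The whole input must eventually reduce to this symbol; otherwise a syntax error is reported.
The action is: pg_yyget_extra(yyscanner)->parsetree = $1;
That is, when parsing completes, the resulting parse tree is stored in parsetree.

2. stmtmulti
stmtmulti: stmtmulti ';' stmt
This is a left-recursive production: PG accepts multiple statements separated by semicolons (";"), and each statement is defined by stmt.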
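The effect of the stmtmulti actions (append each non-empty statement, record its start as the position after the preceding semicolon, and fix up the previous statement's length when the next semicolon is seen) can be modeled without a parser. The sketch below is a simplified stand-in under stated assumptions: RawStmtSketch and split_statements are hypothetical names, a statement counts as empty when it contains only whitespace, and semicolons inside string literals or comments are not handled. As in gram.y, a length of 0 means that the statement runs to the end of the string.

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical stand-in for RawStmt: only location and length are kept. */
typedef struct {
    int location;   /* offset of the statement text in the input */
    int len;        /* length in bytes; 0 means "rest of the string" */
} RawStmtSketch;

/* Returns 1 if sql[start, end) contains only whitespace. */
static int is_blank(const char *sql, int start, int end)
{
    for (int i = start; i < end; i++)
        if (!isspace((unsigned char) sql[i]))
            return 0;
    return 1;
}

/* Fills out[] with one entry per non-empty statement and returns the count. */
int split_statements(const char *sql, RawStmtSketch *out, int max_out)
{
    int n = 0;
    int seg_start = 0;
    int prev_semi = -1;     /* position of the last ';' seen, -1 if none */

    for (int i = 0;; i++) {
        if (sql[i] != ';' && sql[i] != '\0')
            continue;

        /* A statement ended: this models reducing "stmtmulti ';' stmt". */
        if (prev_semi >= 0 && n > 0 && out[n - 1].len == 0)
            out[n - 1].len = prev_semi - out[n - 1].location;  /* updateRawStmtEnd(..., @2) */

        if (!is_blank(sql, seg_start, i)) {
            assert(n < max_out);
            /* makeRawStmt($1, 0) for the first rule, makeRawStmt($3, @2 + 1) otherwise */
            out[n].location = (prev_semi < 0) ? 0 : prev_semi + 1;
            out[n].len = 0;
            n++;
        }

        if (sql[i] == '\0')
            break;
        prev_semi = i;
        seg_start = i + 1;
    }
    return n;
}
```

For the input "select 1;;select 2" this yields two entries: the first with location 0 and length 8, the second with location 10 and length 0. The empty statement between the two semicolons is discarded, and the first statement's length is not overwritten when the second semicolon is processed.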

3. stmt


stmt :
            AlterEventTrigStmt
            | AlterCollationStmt
            ...
            | SelectStmt
            ...

stmt covers a large number of statement types. Here we look at the most common one, SelectStmt.

4. SelectStmt


SelectStmt: select_no_parens            %prec UMINUS
            | select_with_parens        %prec UMINUS
        ;
...
select_no_parens:
            simple_select                        { $$ = $1; }
            | select_clause sort_clause
                {
                    insertSelectOptions((SelectStmt *) $1, $2, NIL,
                                        NULL, NULL, NULL,
                                        yyscanner);
                    $$ = $1;
                }
...
simple_select:
            SELECT opt_all_clause opt_target_list
            into_clause from_clause where_clause
            group_clause having_clause window_clause
                {
                    SelectStmt *n = makeNode(SelectStmt);
                    n->targetList = $3;
                    n->intoClause = $4;
                    n->fromClause = $5;
                    n->whereClause = $6;
                    n->groupClause = $7;
                    n->havingClause = $8;
                    n->windowClause = $9;
                    $$ = (Node *)n;
                }
            | SELECT distinct_clause target_list
...

III. References

Flex&Bison
