[Paper Notes] Generating Semantically Precise Scene Graphs from Textual Descriptions for Improved Image Retrieval

📅 2022/8/24

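The paper behind the Stanford Scene Graph Parser (Workshop on Vision and Language at EMNLP 2015, listed in the ACL Anthology)
- The basic idea is to generate scene graphs automatically from text so that they can be used for image retrieval
- https://nlp.stanford.edu/software/scenegraph-parser.shtml
Pipeline
- ① Generate a semantic graph, which is a Universal Dependencies parse with a few modifications
  - 1. Fix quantificational modifiers such as "a lot of"
  - 2. Resolve pronouns
  - 3. Handle plural nouns → add nodes
- ② Extract objects, relations, and attributes from the semantic graph with a rule-based or classifier-based parser
  - The rule-based parser uses Semgrex (described below)
  - Classifier-based parser
    - Predicts the class of each object and the relations
  - Either parser can be used to produce the scene graph
- ③ Use MAP inference to estimate and score the correspondence between objects and bounding boxes, then run image retrieval
  - This step is described in Image Retrieval using Scene Graphs (CVPR 2015)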
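To make the output of step ② concrete, here is a minimal sketch of a scene graph as a data structure, including the "add nodes for plural nouns" idea from step ①. The class and method names (`SceneGraphSketch`, `addObjects`, `relate`) are hypothetical and are not taken from the Stanford implementation; the graph for "Two men are riding a horse." is built by hand rather than parsed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the Stanford Scene Graph Parser's own classes.
class SceneGraphSketch {
    /** An object node with its attributes, e.g. "man" with ["tall"]. */
    static final class ObjectNode {
        final String name;
        final List<String> attributes = new ArrayList<>();
        ObjectNode(String name) { this.name = name; }
    }

    /** A directed relation between two object nodes, e.g. man -ride-> horse. */
    static final class Relation {
        final ObjectNode subject;
        final String predicate;
        final ObjectNode object;
        Relation(ObjectNode subject, String predicate, ObjectNode object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    final List<ObjectNode> objects = new ArrayList<>();
    final List<Relation> relations = new ArrayList<>();

    /** Adds `count` separate nodes for one noun (assumed plural handling). */
    List<ObjectNode> addObjects(String name, int count) {
        List<ObjectNode> added = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            ObjectNode node = new ObjectNode(name);
            objects.add(node);
            added.add(node);
        }
        return added;
    }

    void relate(ObjectNode subject, String predicate, ObjectNode object) {
        relations.add(new Relation(subject, predicate, object));
    }

    /** Hand-built graph for "Two men are riding a horse." */
    static SceneGraphSketch example() {
        SceneGraphSketch g = new SceneGraphSketch();
        List<ObjectNode> men = g.addObjects("man", 2);
        ObjectNode horse = g.addObjects("horse", 1).get(0);
        for (ObjectNode man : men) {
            g.relate(man, "ride", horse); // each copy gets its own relation
        }
        return g;
    }

    public static void main(String[] args) {
        SceneGraphSketch g = example();
        // prints "3 objects, 2 relations"
        System.out.println(g.objects.size() + " objects, " + g.relations.size() + " relations");
    }
}
```

Keeping each copy of a plural noun as its own node means step ③ could, in principle, match every copy to a different bounding box.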
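Rule-based parser
- Nine Semgrex patterns are defined, covering the following constructions
  - Adjectival modifiers
    - e.g. "A smart woman."
  - Subject-predicate-object constructions and subject-predicate constructions without an object
    - e.g. "A man is riding a horse." and "A woman is smiling."
  - Copular constructions
    - Constructions with a copula (a linking verb such as "be"), e.g. "The man is a rider."
  - Prepositional phrases
    - e.g. "A woman is in the house."
  - Possessive constructions
    - e.g. "His watch."
  - Passive constructions
    - Matched through the nsubjpass and nmod:agent relations
  - Clausal modifiers of nouns
    - e.g. "A cat sitting in a chair."
- Concretely, they are defined as Semgrex patterns as follows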

  /* A man is riding a horse. */
  public static SemgrexPattern SUBJ_PRED_OBJ_TRIPLET_PATTERN = SemgrexPattern.compile("{}=pred >nsubj {tag:/NNP?S?/}=subj >/(iobj|dobj|nmod:.*)/=objreln {tag:/NNP?S?/}=obj !> cop {}");

  /* A woman is smiling. */
  public static SemgrexPattern SUBJ_PRED_PAIR_PATTERN = SemgrexPattern.compile("{}=pred >nsubj {tag:/NNP?S?/}=subj !>/(iobj|dobj|nmod:.*)/ {tag:/NNP?S?/} !>cop {}");

  /* The man is a rider. */
  public static SemgrexPattern COPULAR_PATTERN = SemgrexPattern.compile("{}=pred >nsubj {tag:/NNP?S?/}=subj >cop {}");

  /* A smart woman. */
  public static SemgrexPattern ADJ_MOD_PATTERN = SemgrexPattern.compile("{}=obj >/(amod)/ {}=adj");

  /* The man is tall. */
  public static SemgrexPattern ADJ_PRED_PATTERN = SemgrexPattern.compile("{tag:/J.*/}=adj >nsubj {}=obj");

  /* A woman is in the house. */
  public static SemgrexPattern PP_MOD_PATTERN = SemgrexPattern.compile("{tag:/NNP?S?/}=gov >/nmod:.*/=reln {}=mod");

  /* His watch. */
  public static SemgrexPattern POSS_PATTERN = SemgrexPattern.compile("{tag:/NNP?S?/}=gov >/nmod:poss/=reln {tag:/NNP?S?/}=mod");

  /*   */
  public static SemgrexPattern AGENT_PATTERN = SemgrexPattern.compile("{tag:/V.*/}=pred >/nmod:agent/=reln {tag:/NNP?S?/}=subj >nsubjpass {tag:/NNP?S?/}=obj ");

  /* A cat sitting in a chair. */
  public static SemgrexPattern ACL_PATTERN = SemgrexPattern.compile("{}=subj >acl ({tag:/V.*/}=pred >/(iobj|dobj|nmod:.*)/=objreln {tag:/NNP?S?/}=obj)");
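To illustrate what `SUBJ_PRED_OBJ_TRIPLET_PATTERN` captures, the following is a rough, stdlib-only sketch that applies the same three conditions to a hand-written edge list: a noun `nsubj` dependent, a noun dependent via `iobj`, `dobj`, or `nmod:*`, and no `cop` dependent. It does not use Semgrex or CoreNLP. `TripletSketch`, `Edge`, and `triplets` are hypothetical names, words double as node IDs for brevity, and whole-string regex matching is an assumption about how the `/.../` parts of the pattern behave.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical hand-rolled matcher, a stand-in for Semgrex in this sketch.
class TripletSketch {
    /** One dependency edge: gov --reln--> dep, plus the dependent's POS tag. */
    static final class Edge {
        final String gov, reln, dep, depTag;
        Edge(String gov, String reln, String dep, String depTag) {
            this.gov = gov;
            this.reln = reln;
            this.dep = dep;
            this.depTag = depTag;
        }
    }

    // Same regular expressions as in the pattern string.
    private static final Pattern NOUN = Pattern.compile("NNP?S?");
    private static final Pattern OBJ_RELN = Pattern.compile("(iobj|dobj|nmod:.*)");

    /** Returns "subj|pred|obj" for every match of the three conditions. */
    static List<String> triplets(List<Edge> edges) {
        List<String> out = new ArrayList<>();
        for (Edge s : edges) {
            // >nsubj {tag:/NNP?S?/}=subj
            if (!s.reln.equals("nsubj") || !NOUN.matcher(s.depTag).matches()) continue;
            String pred = s.gov;
            // !> cop {} : skip predicates that have a copula dependent
            boolean hasCop = edges.stream()
                    .anyMatch(e -> e.gov.equals(pred) && e.reln.equals("cop"));
            if (hasCop) continue;
            for (Edge o : edges) {
                // >/(iobj|dobj|nmod:.*)/=objreln {tag:/NNP?S?/}=obj
                if (o.gov.equals(pred)
                        && OBJ_RELN.matcher(o.reln).matches()
                        && NOUN.matcher(o.depTag).matches()) {
                    out.add(s.dep + "|" + pred + "|" + o.dep);
                }
            }
        }
        return out;
    }

    /** Hand-written edges for "A man is riding a horse." */
    static List<Edge> exampleEdges() {
        List<Edge> edges = new ArrayList<>();
        edges.add(new Edge("riding", "nsubj", "man", "NN"));
        edges.add(new Edge("riding", "aux", "is", "VBZ"));
        edges.add(new Edge("riding", "dobj", "horse", "NN"));
        return edges;
    }

    public static void main(String[] args) {
        // prints "[man|riding|horse]"
        System.out.println(triplets(exampleEdges()));
    }
}
```

With the real library, the equivalent step would be to run the compiled pattern over a dependency graph and read back the named nodes (`subj`, `pred`, `obj`); the sketch only mirrors the matching conditions.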

Author

YuWd (Yuiga Wada)

Machine learning, competitive programming, iOS, Web
