
JC4003: Natural Language Processing - Group Assessment: Understanding and Generating Explanations from the RuozhiBa Dataset



1 Assessment Overview

In this group assessment, you will explore and experiment with traditional machine learning and deep learning models, including large language models (LLMs), to generate accurate meanings and explanations for the samples provided in the RuozhiBa dataset. The purpose of this exercise is to apply your knowledge from the course to a real-world dataset, practicing your skills in data annotation, model design, and evaluation.

2 Objectives

2.1 Data Annotation

Each student will be responsible for annotating a portion of the RuozhiBa dataset in Chinese to provide clear explanations for each sample. This will aid in understanding the actual meaning behind the data samples.

2.2 Model Training

Working in groups, you will build models to generate accurate meanings or explanations for unseen data samples using the annotated dataset. You may choose from traditional machine learning methods, deep learning models, or large language models.

2.3 Model Evaluation

Your model’s performance will be assessed using both automatic evaluation metrics, such as BLEU and ROUGE, and human evaluation conducted by your group.

2.4 Presentation & Report

Each group will present their model, explain their design choices, and demonstrate their model’s performance. Additionally, you will submit a detailed report on your process.

3 Steps & Requirements

3.1 Data Annotation

The dataset will be divided, and each student will receive a subset for annotation, in a file named with that student’s ID. You should annotate each data sample in Chinese, explaining its meaning in clear and concise terms. These annotated samples will be combined to form the final dataset, which will be split into training and test datasets.

Here are some examples to help you understand better:

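Example 1:

  • Original data: 根据牛顿第一定律,我推算出本次世界百大物理学家排名,爱因斯坦只能屈居第二 (English: "According to Newton’s first law, I have worked out the current ranking of the world’s top 100 physicists: Einstein has to settle for second place.")

  • Annotated result: 牛顿第一定律本意指的是牛顿提出的第一条被公认的物理定律,而不是牛顿排名第一的定律 (English: "Newton’s first law actually refers to the first of the recognized physical laws proposed by Newton, not a law stating that Newton ranks first.")

Example 2:

  • Original data: 浴霸打了一个响指,给全世界一半的人洗了澡 (English: "The bathroom heater (浴霸) snapped its fingers and gave half of the world’s population a bath.")

  • Annotated result: 这里浴霸打响指借用了复仇者联盟中灭霸一个响指可以消灭全世界一半人口的概念进行类比,因此有了给全世界一半的人洗澡的结果 (English: "Here, the bathroom heater snapping its fingers borrows, by analogy, the idea from The Avengers that Thanos (灭霸) can wipe out half of the world’s population with a snap of his fingers; hence the result of giving half of the world’s population a bath.")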
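The combine-then-split step described in Section 3.1 could look roughly like the sketch below. The brief does not specify the file format, the split ratio, or who performs the split, so the function name `combine_and_split`, the in-memory dictionary keyed by student ID, the 80/20 ratio, and the fixed seed are all assumptions made for illustration.

```python
import random


def combine_and_split(annotations_by_student, test_ratio=0.2, seed=42):
    """Merge per-student annotation lists, then shuffle and split into (train, test).

    annotations_by_student: dict mapping a student ID (hypothetical key) to a
    list of (original_text, explanation) pairs.
    test_ratio and seed are illustrative defaults, not values from the brief.
    """
    if not 0.0 < test_ratio < 1.0:
        raise ValueError("test_ratio must be strictly between 0 and 1")

    # Iterate over student IDs in sorted order so the result is reproducible.
    combined = [
        pair
        for student_id in sorted(annotations_by_student)
        for pair in annotations_by_student[student_id]
    ]
    if len(combined) < 2:
        raise ValueError("need at least two samples to split")

    # A seeded generator keeps the shuffle deterministic across runs.
    random.Random(seed).shuffle(combined)

    # Keep at least one sample on each side of the split.
    n_test = min(len(combined) - 1, max(1, round(len(combined) * test_ratio)))
    return combined[n_test:], combined[:n_test]
```

Fixing the seed means every group member works with the same training and test samples, which keeps reported scores comparable across experiments.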
    

3.2 Forming Groups

You will form groups of 4-6 students. Each group will work collaboratively on building a model to generate the meanings for the data samples. Group formation is flexible within each programme (cross-programme groups are not allowed), but must be completed by Monday, September 23, 2024. Each group leader should send the member list to the corresponding course coordinator after the deadline.

3.3 Model Development

You are free to use traditional machine learning models (e.g., Naive Bayes, Logistic Regression) or deep learning models (e.g., RNN, LSTM, Transformers). For those interested in LLMs, the recommended approach is to use prompt engineering techniques to guide the LLM in generating accurate meanings for the dataset samples. Groups that wish to challenge themselves can attempt to fine-tune LLMs using the annotated RuozhiBa dataset to improve their model’s performance.
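As one possible illustration of the prompt engineering route, the sketch below assembles a few-shot prompt from the two annotated examples in Section 3.1. The instruction wording, the `句子`/`解释` field labels, and the names `FEW_SHOT_EXAMPLES` and `build_prompt` are assumptions, not part of the brief; the resulting string would be passed to whichever LLM the group chooses.

```python
# Few-shot examples copied from the annotated examples in Section 3.1.
FEW_SHOT_EXAMPLES = [
    (
        "根据牛顿第一定律,我推算出本次世界百大物理学家排名,爱因斯坦只能屈居第二",
        "牛顿第一定律本意指的是牛顿提出的第一条被公认的物理定律,而不是牛顿排名第一的定律",
    ),
    (
        "浴霸打了一个响指,给全世界一半的人洗了澡",
        "这里浴霸打响指借用了复仇者联盟中灭霸一个响指可以消灭全世界一半人口的概念进行类比,"
        "因此有了给全世界一半的人洗澡的结果",
    ),
]


def build_prompt(sample, examples=FEW_SHOT_EXAMPLES):
    """Assemble a few-shot prompt asking for a Chinese explanation of `sample`."""
    # Hypothetical instruction: "Explain the real meaning of the sentence
    # below in concise, clear Chinese."
    lines = ["请用简洁清晰的中文解释下面句子的真实含义。", ""]

    # Each demonstration is a sentence ("句子") followed by its explanation ("解释").
    for original, explanation in examples:
        lines.append(f"句子:{original}")
        lines.append(f"解释:{explanation}")
        lines.append("")

    # The unseen sample comes last, with the explanation left for the model to fill in.
    lines.append(f"句子:{sample}")
    lines.append("解释:")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt("浴霸打了一个响指,给全世界一半的人洗了澡"))
```

In practice the demonstrations would be drawn from the training split only, so that test samples never appear inside the prompt.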

3.4 Model Evaluation

Each group’s model will be evaluated using:

  • Automatic evaluation metrics: BLEU, ROUGE, and other applicable metrics, to assess the accuracy of the generated meanings.

  • Human evaluation: your group will assess the quality of the outputs based on specific criteria (e.g., fluency, accuracy, relevance).
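To make the automatic metrics more concrete, here is a minimal, dependency-free sketch of two quantities underlying them: clipped n-gram precision (the building block of BLEU) and an F1 variant of ROUGE-L based on the longest common subsequence. It assumes character-level tokenization, a common simple choice for Chinese text that the brief does not prescribe, and it omits BLEU’s brevity penalty and its geometric mean over n-gram orders. For reported results, an established metric implementation would be the safer choice.

```python
from collections import Counter


def char_tokens(text):
    """Split text into characters, skipping whitespace (assumed tokenization)."""
    return [ch for ch in text if not ch.isspace()]


def clipped_ngram_precision(candidate, reference, n=1):
    """Fraction of candidate n-grams found in the reference, with clipped counts."""
    cand = char_tokens(candidate)
    ref = char_tokens(reference)
    cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    total = sum(cand_ngrams.values())
    if total == 0:
        return 0.0
    # Clipping: an n-gram is credited at most as often as it occurs in the reference.
    overlap = sum(min(count, ref_ngrams[gram]) for gram, count in cand_ngrams.items())
    return overlap / total


def lcs_length(a, b):
    """Length of the longest common subsequence, via row-by-row dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """ROUGE-L as the harmonic mean of LCS-based precision and recall."""
    cand = char_tokens(candidate)
    ref = char_tokens(reference)
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    reference = "牛顿第一定律本意指的是牛顿提出的第一条被公认的物理定律"
    candidate = "牛顿第一定律是牛顿提出的第一条物理定律"
    print("unigram precision:", clipped_ngram_precision(candidate, reference, n=1))
    print("bigram precision:", clipped_ngram_precision(candidate, reference, n=2))
    print("ROUGE-L F1:", rouge_l_f1(candidate, reference))
```

Both scores measure surface overlap with the reference only, which is why the brief pairs them with human evaluation: an explanation can be correct while sharing few characters with the reference annotation.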

3.5 Presentation & Report

Each group will prepare a presentation to explain the design of their model, demonstrate its performance on the test dataset, and discuss challenges faced and solutions implemented.

You will also submit a detailed report that covers:

  • Introduction & Objectives: Why you chose your model(s) and what you aimed to achieve.

  • Methodology: A step-by-step explanation of your approach, from annotation to model design and training.

  • Experiments & Results: Your evaluation results, observations, and any adjustments you made to improve your model.

  • Discussion: Insights, challenges, and future work you would consider.

The report should be between 3000 and 5000 words and include references to the tools, libraries, and models you used. Each group member’s contribution and percentage should be highlighted at the beginning of the report.

4 Evaluation Criteria

4.1 Data Annotation (20%)

Goal: Evaluate the clarity, accuracy, and comprehensiveness of the annotated explanations for the RuozhiBa dataset samples.

| Criteria | Weight | Description |
|---|---|---|
| Clarity of Explanation | | The annotations should provide clear, easily understandable explanations of each sample’s meaning. No ambiguity or vagueness should be present. |
| Accuracy of Annotation | | The meaning of each sample should be annotated correctly in line with its context. This involves capturing nuances and key elements accurately. |
| Consistency of Terminology | | The use of terminology should be consistent throughout the annotations, especially when describing similar concepts across different samples. |
| Completeness | | All samples in the assigned portion of the dataset should be annotated. No gaps or skipped samples should be present. |

4.2 Presentation (40%)

Goal: Assess how effectively the group explains their methodology, model design, and results in a clear, professional manner.

| Criteria | Weight | Description |
|---|---|---|
| Introduction & Objectives | | The group provides a clear introduction to their approach, objectives, and rationale for choosing their models and methods. |
| Methodology Explanation | | Clear and logical explanation of the methodology. This includes the model design, choice of algorithms, training processes, and evaluation setup. |
| Results & Demonstration | | The group demonstrates their model’s performance on the test dataset. Includes a clear discussion of automatic and human evaluation metrics. |
| Visual Aids & Communication | | The presentation is well-organized, with clear slides, diagrams, or visual aids. The group communicates confidently and explains key points clearly. |
| Q&A Handling | | The group effectively handles questions from the audience or instructor, demonstrating understanding of their model and results. |

4.3 Report (40%)

Goal: Assess the depth of the group’s understanding, analytical rigor, and ability to communicate their work in a structured, professional format.

| Criteria | Weight | Description |
|---|---|---|
| Introduction & Objectives | | The introduction clearly defines the group’s goals, the problem they are solving, and the approach they are taking. |
| Dataset & Annotation Process | | Clear explanation of the RuozhiBa dataset and the group’s approach to annotation. Discussion of challenges faced during annotation (if any). |
| Model Selection & Methodology | | Detailed description of the model(s) selected, the rationale for choosing them, and the methodology used. Includes data preprocessing steps, model architecture, and training processes. |
| Experiments & Results | | Presentation of the experiments conducted and the results obtained (using BLEU, ROUGE, and human evaluation). Includes insightful analysis of the results, including error analysis and discussion of challenges. |
| Discussion & Critical Analysis | | Discussion of the model’s strengths, weaknesses, and potential improvements. Includes a critical evaluation of why certain decisions were made, including potential trade-offs. |
| Future Work & Improvements | | The group provides a thoughtful discussion of how the work could be extended or improved in the future. |
| Structure, Writing & Clarity | | The report is well-structured, with clear sections, proper use of headings, and professional writing. No major grammatical or spelling errors. |

4.4 Final Breakdown

  • Data Annotation: 20%

  • Presentation: 40%

  • Report: 40%
