
Machine Learning in Practice Assignment: ML Solutions for Misinformation Detection in Social Media


Machine Learning in Practice – 2024 S1 – Assignment

ML Solutions for Misinformation Detection in Social Media

ML Project: Jupyter Lifecycle Expedition

Introduction

Social media, particularly X (formerly known as Twitter), has revolutionized the way information spreads, but it is also an incubator for fake news and misinformation. Misinformation on X can take diverse forms and may stem from various sources, whether intentional or not, taking advantage of the platform's viral nature to widen its dissemination. As we approach major events like elections, the urgency of addressing this challenge becomes increasingly apparent. Because misinformation has no single form, there is a growing need for innovative approaches to addressing it.

Machine learning and natural language processing (NLP) offer promising solutions to identify trends and detect misinformation. However, free-text data is challenging to incorporate into classification models due to its lack of structure. To overcome this challenge, latent variable models such as topic models or feature generation can be used to infer intermediary representations that can be used as structured data for classification tasks.
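As a rough illustration of turning free text into a structured representation, the sketch below builds TF-IDF vectors using only the Python standard library. TF-IDF is used here as a simpler stand-in for the topic models mentioned above; it is not a latent variable model. The sample texts and the helper names `tokenize` and `tfidf_vectors` are invented for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tfidf_vectors(documents):
    """Return (vocabulary, rows), where each row is a list of TF-IDF weights.

    Hypothetical helper: uses relative term frequency and idf = log(N / df).
    """
    tokenized = [tokenize(doc) for doc in documents]
    vocabulary = sorted({tok for doc in tokenized for tok in doc})
    n_docs = len(documents)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(tok for doc in tokenized for tok in set(doc))
    rows = []
    for doc in tokenized:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid division by zero on empty text
        rows.append([
            (counts[term] / total) * math.log(n_docs / doc_freq[term])
            for term in vocabulary
        ])
    return vocabulary, rows


# Invented sample texts, not taken from the assignment dataset.
tweets = [
    "Breaking: miracle cure found",
    "Election results confirmed by officials",
    "Miracle election cure claims debunked",
]
vocab, matrix = tfidf_vectors(tweets)
```

Each row of `matrix` is a fixed-length numeric vector that a classifier could consume alongside the numeric metadata columns. In a notebook, a library implementation such as scikit-learn's `TfidfVectorizer` would normally replace this hand-written version.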

In this project, you will showcase the significance of integrating data sourced from X alongside newly engineered features to classify the authenticity of news-related tweets. A dataset obtained from X has been web-scraped, and the various sections of this assignment will establish one kind of exploratory strategy for addressing a classification challenge.

Dataset

The assignment dataset consists of an assortment of news headlines, along with associated X posts relating to each headline. The dataset consists of 134,198 rows and 15 columns. There are 3 types of feature variables and only 1 target variable:

Feature Variables

➢ Textual Data

  1. news_author (str) - author of a news headline.

  2. news_headline (str) - headline of a news article.

  3. related_tweet (str) - X post relating to the news headline, posted by a user.

➢ Post Metadata

  1. post_replies (int) - number of replies on the post.

  2. post_retweets (int) - number of retweets on the post.

  3. post_favourites (int) - number of favourites on the post.

  4. post_quotes (int) - number of times the post has been quote tweeted.

➢ User Metadata

  1. user_followers (int) - number of followers.

  2. user_following (int) - number of following users.

  3. user_friends (int) - number of friends (mutual following).

  4. user_tweet_count (int) - total number of tweets the user has made.

  5. user_favourites_count (int) - total number of favourites the user has across all tweets.

  6. user_mentions (int) - total number of users mentioned (@) in related_tweet.

  7. user_tweet_count_lists (int) - total number of tweets the user has in their lists.

Target Variable

➢ Misinformation (bool) - a True/False value indicating whether a tweet is false.
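The column groups above can be written down as a small schema and used to sanity-check a loaded file. This is only a sketch: the helper name `check_columns` and the in-memory CSV sample are assumptions, and the specification does not state the real file name, file format, or exact header spelling.

```python
import csv
import io

# Column names as listed in the specification, grouped by type.
TEXT_COLUMNS = ["news_author", "news_headline", "related_tweet"]
POST_COLUMNS = ["post_replies", "post_retweets", "post_favourites", "post_quotes"]
USER_COLUMNS = [
    "user_followers", "user_following", "user_friends", "user_tweet_count",
    "user_favourites_count", "user_mentions", "user_tweet_count_lists",
]
TARGET_COLUMN = "Misinformation"
EXPECTED_COLUMNS = TEXT_COLUMNS + POST_COLUMNS + USER_COLUMNS + [TARGET_COLUMN]


def check_columns(header):
    """Return (missing, unexpected) column names for a header row."""
    missing = [c for c in EXPECTED_COLUMNS if c not in header]
    unexpected = [c for c in header if c not in EXPECTED_COLUMNS]
    return missing, unexpected


# Tiny in-memory stand-in for the real file (assumed here to be CSV).
sample = io.StringIO(",".join(EXPECTED_COLUMNS) + "\n")
header = next(csv.reader(sample))
missing, unexpected = check_columns(header)
```

The three feature groups plus the target add up to the 15 columns stated in the dataset description, so a non-empty `missing` or `unexpected` list would suggest the file differs from the specification.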

Specification Summary

  • Type: Project report, individual assignment

  • Deliverable: Report as a Jupyter notebook (.ipynb) only

The aim of this assignment is to provide you with experience in the steps involved in text preparation, feature generation, and creating, evaluating, and improving classification models. You will need to research NLP and Python functionality if you aim to achieve excellent marks and discover innovative techniques and methods.

Exploration, Preparation & Feature Generation

This section requires you to explore various aspects of your dataset and prepare the data for later sections. It is important that you take time to explore your data carefully and make preparation and generation decisions that make sense.

Preprocessing steps are essential to clean and standardize data before feature generation and to enhance the quality of the extracted features. Classification models that use generated features may understand and analyze the data better, or learn patterns and relationships better, than models that do not.
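A minimal sketch of one possible cleaning step for the related_tweet text is shown below, using only the standard library. The function name `clean_tweet`, the regular expressions, and the example tweet are assumptions made for illustration; which elements to strip (for instance, whether to keep hashtags or mentions) is a modelling decision the specification leaves to you.

```python
import re

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
NON_TEXT_RE = re.compile(r"[^a-z0-9\s#]")
WHITESPACE_RE = re.compile(r"\s+")


def clean_tweet(text):
    """Lowercase a tweet and strip URLs, @mentions, and punctuation."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = NON_TEXT_RE.sub(" ", text)  # keep letters, digits, and hashtags
    return WHITESPACE_RE.sub(" ", text).strip()


# Invented example tweet, not taken from the assignment dataset.
example = "BREAKING!!! @user says the vote was rigged https://t.co/abc #Election2024"
cleaned = clean_tweet(example)
# cleaned == "breaking says the vote was rigged #election2024"
```

Note that lowercasing and punctuation removal discard signals such as capitalisation and exclamation marks, so it may be worth deriving any features that depend on them from the raw text before cleaning.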

X (Twitter) recently open-sourced its algorithms, and many articles provide insights into which features of a tweet are important. Knowing this may help you better understand how to classify a tweet as misinformation.

Your task is to:

➢ Explore and prepare your data.

In this task, you could perform the necessary cleaning and pre-processing tasks, and explore or try to understand and profile your data through various techniques (e.g. clustering, topic modelling).

➢ Generate new features from your data.

You should have a good understanding of your data from above and can now experiment with feature generation. In this task, you should consider what can be generated to improve your classification model.
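The sketch below shows a few features that could be generated from a single row, assuming the row is available as a dict keyed by the column names in the dataset description. The function name `generate_features`, the chosen features, and the example row are all invented for illustration; none of them are prescribed by the assignment.

```python
import math
import re


def generate_features(row):
    """Derive a few illustrative features from one dataset row (a dict)."""
    tweet = row["related_tweet"]
    letters = [ch for ch in tweet if ch.isalpha()]
    engagement = (
        row["post_replies"] + row["post_retweets"]
        + row["post_favourites"] + row["post_quotes"]
    )
    return {
        "tweet_length": len(tweet),
        "hashtag_count": len(re.findall(r"#\w+", tweet)),
        "exclamation_count": tweet.count("!"),
        # Share of letters that are upper case (0.0 when there are no letters).
        "uppercase_ratio": (
            sum(ch.isupper() for ch in letters) / len(letters) if letters else 0.0
        ),
        # log1p compresses large counts so a few viral posts do not dominate.
        "log_engagement": math.log1p(engagement),
        # +1 in the denominator avoids division by zero.
        "follower_ratio": row["user_followers"] / (row["user_following"] + 1),
    }


# Invented example row using the column names from the specification.
row = {
    "related_tweet": "SHOCKING!! They lied #truth #wakeup",
    "post_replies": 2, "post_retweets": 5, "post_favourites": 3, "post_quotes": 0,
    "user_followers": 99, "user_following": 9,
}
features = generate_features(row)
```

Whether any of these features actually helps is an empirical question; comparing model performance with and without each one is one way to justify keeping it.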

Classification (Model Building and Evaluation)

It is important to try multiple variations of features/parameters in model building to achieve the best performance. Additionally, you should elaborate on the performance metrics you have used to evaluate your model and explain why they suit the available data.

Your task is to:

➢ Experiment with developing and evaluating classification models to find the model with the best overall performance.

Once you find the best-performing model, you should only show how you built and evaluated that specific one.

➢ Elaborate on the major tasks you have undertaken to improve the best-performing model and explain why the performance metrics suit the available data.
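To illustrate why the choice of metric should match the data, the sketch below computes accuracy, precision, recall, and F1 by hand for a binary target. The specification does not state the class balance of the Misinformation column, so the imbalanced labels here are purely an assumption, and `binary_metrics` is a hypothetical helper name.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    tn = sum(1 for t, p in pairs if not t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall else 0.0
    )
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Invented labels: 8 genuine tweets (False) and 2 misinformation tweets (True).
y_true = [False] * 8 + [True] * 2
# A trivial model that always predicts "not misinformation".
y_pred_majority = [False] * 10
report = binary_metrics(y_true, y_pred_majority)
```

In this invented case the trivial model scores 0.8 accuracy while its recall and F1 for the misinformation class are both 0.0, which is the kind of argument you could make when explaining why a given metric suits (or does not suit) the data.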

Submission

Your report should be delivered in an .ipynb file. A notebook template is provided to show how to structure your work. You need to use the template (Assignment_Template.ipynb) and strictly follow its format, which is designed based on the provided Assignment rubric.

It can be useful to add some in-line comments (using #) next to your code to explain it briefly.

You will get a better mark if your approach is innovative. This means no other student has applied it, or only a few others have applied a similar approach with some differences. Therefore, it is highly advised that you do not share your creative work with anyone else. You can still discuss preliminary ideas and help each other; just remember that your submission must be your own work.

You will only need to submit one .ipynb file and should use the provided template file. Before submission:

  ➢ Ensure that your code can run without errors. If your code returns an error at any point, your assignment will only be marked up until the error, and the remainder of your code won't earn any marks. Example errors include syntax errors and NameErrors.

  ➢ Make sure that all the important outputs are shown in your notebook. However, avoid showing trivial outputs. For example, you should remove code that displays the whole DataFrame for no particular reason.

  ➢ Your marker will first look at your generated output as a reference without running your notebook (unless deemed necessary). Therefore, your significant outputs need to be generated, and the elaboration should be provided in the notebook, as shown in the template.
