Campus Analytics Challenge 2022: Determine Transaction Categories Using Machine Learning and Natural Language Processing
Challenge Type: computer science
$20,000
prize pool
Prizes: A total of $20,000 will be awarded to the 8 winners of the Campus Analytics Challenge. Below is a breakdown of the prize distribution:

• One (1) first place prize - $7,500
• One (1) second place prize - $5,000
• One (1) third place prize - $2,500
• Five (5) finalist prizes - $1,000 each

NOTE - Sponsor reserves the right to award fewer than the advertised number of prizes in the event an insufficient number of submissions/solutions are received that are of reasonable quality to merit a prize.

At Wells Fargo, our data scientists play a key role in driving innovative and meaningful insights that enable our lines of business to provide a world-class experience to our stakeholders. The Campus Analytics Challenge 2022 (“Challenge”) puts you in the role of a data scientist and calls you to use Machine Learning and Natural Language Processing to predict transaction categories. The dataset is small enough that you should be able to work with it on a standard laptop.

To help get your creative juices flowing, we encourage you to explore Machine Learning and Natural Language Processing research, literature and beyond, as you may find a creative approach in other sub-fields of data science.

________________________________________________________________________________________________________________________________________

Challenge Background: It is no surprise that financial companies need to help their customers organize their finances. Customers want to know what they spend their money on to keep balances in check. By categorizing transactions and building better customer engagement tools, Wells Fargo can help customers identify frequent purchases and subscriptions, sort income and activity liability with higher accuracy, and reduce credit risks. 

Transaction categorization is the ability to recognize the purpose of a transaction based on its description. For a long time, this process was done manually, but technology can now do it efficiently.

This Challenge will focus on Natural Language Processing using the power of Machine Learning to predict which category a transaction will fall into, given the description of the transaction. 

Challenge Starts

                12:00:01 p.m. Eastern Time (“ET”) on 06/13/2022

Challenge Submission Deadline

                12:00:01 p.m. ET on 07/13/2022


Submissions Judged

                07/14/2022 – 08/05/2022

Potential Finalists & Winners Notified 

                08/10/2022 (on or about)

________________________________________________________________________________________________________________________________________

Challenge Objective: This Challenge will focus on Natural Language Processing using the power of Machine Learning to predict which category a transaction will fall into, given the description of the transaction. Your solution must:

  1. Meet the Challenge Criteria
  2. Follow the Challenge Instructions and Requirements
  3. Incorporate the Key Deliverables, each described in detail below.

________________________________________________________________________________________________________________________________________

Eligibility: This Challenge is sponsored by Wells Fargo Bank, N.A. (“Sponsor” or “Wells Fargo”) for full-time or part-time students, 18 years of age or older at the time of entry, who are enrolled in any higher education degree program on campus or online at colleges or universities in the United States and District of Columbia, including students attending two- and four-year programs, technical and vocational schools, junior and community colleges, as well as graduate and professional education students (collectively “Students”). 

Employees of Wells Fargo or MindSumo, Inc. and their respective parents, divisions, affiliates, subsidiaries, their promotional or marketing agencies, government entities and public officials, and their immediate family members (parent, child, sibling and spouse) and persons living in the same households of each such employee (whether related or not) are not eligible. To be eligible to receive any prize, potential winners must have a valid U.S. tax identification number and meet all the eligibility requirements at the time the prize is awarded. Potential winners will be required to provide Sponsor with proof that they meet the eligibility requirements for this Challenge. Void where prohibited by law.

Deliverables

Challenge Criteria: Build a model to predict transaction categories using the 10 (ten) distinct categories that a transaction may fall into. The categories are as follows:

• Communication Services 
• Education 
• Entertainment
• Finance 
• Health and Community Service 
• Property and Business Services 
• Retail Trade 
• Services to Transport 
• Trade, Professional and Personal Services
• Travel 
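
As a small illustration, the ten categories above can be kept in one place so that predicted labels are checked against them before submission. This is only a sketch: the names CATEGORIES, LABEL_TO_ID, ID_TO_LABEL and invalid_labels are hypothetical helpers, not part of the Challenge materials, and the exact spelling of the labels in the data files is assumed to match the list above.

```python
# The ten categories, copied verbatim from the list above.
# Assumption: the label strings in the data files are spelled the same way.
CATEGORIES = (
    "Communication Services",
    "Education",
    "Entertainment",
    "Finance",
    "Health and Community Service",
    "Property and Business Services",
    "Retail Trade",
    "Services to Transport",
    "Trade, Professional and Personal Services",
    "Travel",
)

# Hypothetical lookup tables between label strings and integer ids.
LABEL_TO_ID = {name: i for i, name in enumerate(CATEGORIES)}
ID_TO_LABEL = {i: name for name, i in LABEL_TO_ID.items()}


def invalid_labels(predictions):
    """Return the distinct predicted labels that are not one of the ten categories."""
    return sorted({p for p in predictions if p not in LABEL_TO_ID})


if __name__ == "__main__":
    # "travel" differs in case from "Travel", so it is reported as invalid.
    print(invalid_labels(["Travel", "travel", "Finance"]))
```

Note that one category name contains a comma, so any CSV output should quote its fields.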

The dataset provided on the Challenge page is synthetic. This dataset has been scrubbed to replace all transaction numbers with 1's.
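
Because the numbers have been replaced with 1's, runs of 1's carry no information beyond "a number was here". One possible preprocessing step, sketched below, collapses each run into a single placeholder token. The function name tokenize, the `<num>` placeholder and the sample description are all assumptions made for illustration; the actual format of the descriptions should be checked against the provided files.

```python
import re

# Assumption: scrubbed numbers appear as runs of the digit 1.
_NUM_RUN = re.compile(r"1+")
# Split on anything that is not a letter, a digit, or part of the placeholder.
_SEPARATORS = re.compile(r"[^a-z0-9<>]+")


def tokenize(description):
    """Lowercase the text, replace each run of 1's with <num>, and split into tokens."""
    text = description.lower()
    text = _NUM_RUN.sub(" <num> ", text)
    return [tok for tok in _SEPARATORS.split(text) if tok]


if __name__ == "__main__":
    # Hypothetical description, not taken from the Challenge data.
    print(tokenize("STARBUCKS #11111 SEATTLE WA"))
```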

Training Dataset – The Training Dataset contains 40,000 unique transactions and their corresponding transaction categories. This dataset should be used to develop your solution.

Test Dataset – The Test Dataset contains 10,000 unique transactions with the transaction categories omitted. This dataset should be used to test your solution and identify the correct transaction categories.
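
Since the Test Dataset has no labels, accuracy can only be estimated from the Training Dataset, for example by holding out part of it. The sketch below shows one possible baseline, a multinomial Naive Bayes classifier with add-one smoothing written with the standard library only, together with a hold-out accuracy estimate. It is not a method prescribed by the Challenge; the class and function names, the whitespace tokenizer and the 80/20 split are illustrative assumptions.

```python
import math
import random
from collections import Counter, defaultdict


def tokens(text):
    """Very simple tokenizer (assumption): lowercase and split on whitespace."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokens(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        self.total = sum(self.class_counts.values())
        return self

    def predict_one(self, text):
        best_label, best_score = None, -math.inf
        vocab_size = len(self.vocab)
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + vocab_size
            score = math.log(n_docs / self.total)  # log prior
            for w in tokens(text):
                if w in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[w] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

    def predict(self, texts):
        return [self.predict_one(t) for t in texts]


def holdout_accuracy(rows, holdout_fraction=0.2, seed=0):
    """Shuffle (text, label) pairs, train on one part, score on the held-out part."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * (1 - holdout_fraction))
    train, held = rows[:cut], rows[cut:]
    model = NaiveBayes().fit([t for t, _ in train], [y for _, y in train])
    preds = model.predict([t for t, _ in held])
    return sum(p == y for p, (_, y) in zip(preds, held)) / len(held)
```

With the real files, `rows` would be the 40,000 (description, category) pairs from the Training Dataset; the held-out score is only an estimate of how the model might perform on the 10,000 test transactions.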

________________________________________________________________________________________________________________________________________

Challenge Instructions and Requirements: When creating your Solution, you may use a novel combination of existing Machine Learning and/or Natural Language Processing methods, or develop your own novel method, in order to extract and/or represent thematic information from the data file.

You must provide citations and sources for any additional data and/or methodologies used. 

________________________________________________________________________________________________________________________________________

Key Deliverables:

Deliverable 1) Describe, using an abstract, your approach and methodology. Include a visual representation of your analytic process flow.

Deliverable 2) Return the uncategorized content dataset with the categories identified.
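
One way Deliverable 2 might be produced is to append a predicted category to each row of the Test Dataset and write the result as CSV. The sketch below uses only the standard library. The function name write_predictions and the column names "description" and "category" are assumptions; the real column names and the required file format should be taken from the Challenge files and rules.

```python
import csv
import io


def write_predictions(test_rows, predictions, out_file, label_column="category"):
    """Write each test row plus its predicted category to out_file as CSV.

    test_rows: list of dicts, e.g. as produced by csv.DictReader.
    predictions: one predicted label per row, in the same order.
    Returns the number of rows written.
    """
    test_rows = list(test_rows)
    if len(test_rows) != len(predictions):
        raise ValueError("need exactly one prediction per test row")
    if not test_rows:
        return 0
    # Keep the original columns; put the label column last (replacing an empty one if present).
    fieldnames = [k for k in test_rows[0].keys() if k != label_column] + [label_column]
    writer = csv.DictWriter(out_file, fieldnames=fieldnames)
    writer.writeheader()
    for row, label in zip(test_rows, predictions):
        writer.writerow({**row, label_column: label})
    return len(test_rows)


if __name__ == "__main__":
    # Hypothetical row; the csv module quotes the label because it contains a comma.
    buffer = io.StringIO()
    write_predictions(
        [{"description": "barber shop 111"}],
        ["Trade, Professional and Personal Services"],
        buffer,
    )
    print(buffer.getvalue())
```

In practice `out_file` would be a file opened with `open(path, "w", newline="")`, as the csv module documentation recommends.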

Deliverable 3) Document your code and reference the analytic process flow diagram from Deliverable 1.

________________________________________________________________________________________________________________________________________

Submitting as a Team: You are welcome to work as a team on the Campus Analytics Challenge. However, you should only provide ONE submission for the entire team.

Submission questions


Have you read and do you agree to the attached Campus Analytics 2022 Challenge Rules document?

(Required)
Yes

In what type of degree program are you currently enrolled?

(Required)
Associate degree
Bachelor's degree
Master's degree
Doctoral degree
Other

Please provide a phone number where you can be reached in the event that you are considered for a prize

(Required)

I understand that I CANNOT use the datasets attached to this challenge for anything other than my submission to this Campus Analytics Challenge

(Required)
Yes

Are you enrolled in one of the following programs? Check one if it applies to you. If not, please check "Other"

(Required)
MBA
Computer Science
Information Science
Mathematics/Applied Mathematics/Statistics
Engineering
Data Analytics
Chemistry
Biology
Economics
Marketing
Other

What college or university do you attend?

(Required)